# Risk, Insurance, and the Actuary

Risk is an inherent part of our daily life. As a result, most of us take out insurance policies as protection against scenarios which, were they to occur, could cause hardship, whether for us or, as in the case of life insurance, for our families.

Insurance companies write many types of policies. The mutual risks of the policy holders are shared so that claims made against the policies can be covered at a much reduced cost. If a policy is priced fairly, the premium reflects the contribution of the insured’s risk to the overall risk.

As policy holders, we want the best price to cover the risk we are offloading; as shareholders of the insurance company (again, often us if we have superannuation), we require that the premiums be sufficient to ensure the company stays in business.

It is then very important that analysts pricing the policies (and those calculating the required level of capital to meet the claim liabilities) have the statistical knowledge necessary to measure risk accurately! Understanding risk is even more critical in the framework of Solvency II (*) capital requirements (if it ever gets enforced).

The task is made more difficult as the duration of the policy life varies considerably. Some insurance cover is claimed against shortly after the incident occurs with a short processing time – automobile accidents for instance typically fit this category. This class of cover is termed **short-tail liabilities** as payments are completed within a short timeframe of the incident occurring.

Other cases arise many years after the original policy was taken out, or payments may occur many years after the original claim was raised, for example medical malpractice. These are termed **long-tail liabilities** as payments may be made long after the original policy was activated or the incident occurred. Due to the long forecast horizon and (generally) higher volatility in the claim amounts, long-tail liabilities are inherently riskier.

Life insurance is in its own category as everybody dies sometime.

## Meet the data

For convenience, and because it is generally less well understood, we restrict our focus to long-tail liability insurance data.

For each claim we have many attributes, but four that are universal to all claims: payment amount(s), incident date (when the originating event resulting in the claim occurred), payment date(s), and state of claim (are further payments possible or is the claim settled). These attributes allow the aggregation of the individual claim data into a series more amenable for analysis at the financial statement level where the volatility of individual claims should be largely eliminated since the risk is pooled.

Actuaries tend to present their data cumulatively in a table where the rows are accident years and the column index (development time in actuarial parlance) is the delay between the accident year and the year of payment.

Thus the values in development lag 0 correspond to all payments made toward claims in the year the accident occurred. The values in development lag 10 correspond to the sum of the payments made in the eleven years since the accident occurred.
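As a minimal sketch of this layout, the snippet below aggregates individual payments into cells indexed by accident year and development lag, then accumulates across lags. The payment records, function names, and amounts are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import defaultdict

# Hypothetical claim payments: (accident_year, payment_year, amount).
# These figures are invented purely for illustration.
payments = [
    (2018, 2018, 100.0),
    (2018, 2019, 60.0),
    (2018, 2020, 30.0),
    (2019, 2019, 120.0),
    (2019, 2020, 70.0),
    (2020, 2020, 110.0),
]


def incremental_triangle(records):
    """Sum payments into cells keyed by (accident year, development lag)."""
    cells = defaultdict(float)
    for accident_year, payment_year, amount in records:
        lag = payment_year - accident_year
        cells[(accident_year, lag)] += amount
    return dict(cells)


def cumulative_triangle(incremental):
    """Accumulate each accident year's payments across development lags."""
    cumulative = {}
    for year in sorted({y for y, _ in incremental}):
        last_lag = max(lag for y, lag in incremental if y == year)
        running = 0.0
        for lag in range(last_lag + 1):
            # A lag with no recorded payment contributes zero.
            running += incremental.get((year, lag), 0.0)
            cumulative[(year, lag)] = running
    return cumulative


inc = incremental_triangle(payments)
cum = cumulative_triangle(inc)
for (year, lag), value in sorted(cum.items()):
    print(year, lag, value)
```

Each accident year has one fewer observed lag than the year before it, which is what gives the table its triangular shape.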

This presentation likely arose for a number of reasons, the two most important being:

- Cumulative data are much easier to work with in the absence of computers;
- Volatility is visibly less of an issue further into the development tail when examining cumulatives.

The nature of the inherited data presentation produces some unfortunate consequences:

- Variability is hard to apportion between parameter uncertainty and process volatility;
- Calendar year effects (trends down the diagonals) cannot be measured, and therefore cannot be readily predicted;
- Parameter interpretation is difficult due to the confounding calendar year effects; and
- Parsimony is hard to achieve.

The actuarial profession attempts to deal with each of these issues in various ways. For instance, the bootstrap is being used to quantify variability. Data may be indexed against inflation to partially account for calendar year trends.
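To make the inflation indexing idea concrete, here is a small sketch that restates incremental payments in base-year money. It relies on the fact that a cell's calendar year is its accident year plus its development lag, so every cell on one diagonal shares the same index value. The payments and the 5% per year price index are hypothetical, constructed so that the effect is easy to see.

```python
# Hypothetical incremental payments keyed by (accident_year, development_lag).
incremental = {
    (2018, 0): 100.0, (2018, 1): 63.0, (2018, 2): 33.075,
    (2019, 0): 105.0, (2019, 1): 66.15,
    (2020, 0): 110.25,
}

# Hypothetical price index by calendar year (5% inflation per year, base 2018).
price_index = {2018: 1.00, 2019: 1.05, 2020: 1.1025}


def deflate(incremental, price_index, base_year):
    """Restate each payment in base-year money using its calendar year."""
    adjusted = {}
    for (accident_year, lag), amount in incremental.items():
        # Cells on the same diagonal share this calendar year.
        calendar_year = accident_year + lag
        factor = price_index[base_year] / price_index[calendar_year]
        adjusted[(accident_year, lag)] = amount * factor
    return adjusted


real = deflate(incremental, price_index, base_year=2018)
for (year, lag), value in sorted(real.items()):
    print(year, lag, round(value, 2))
```

In this constructed example the deflated payments are the same for every accident year at a given lag, so the only trend in the raw figures was the calendar year one. With real data an external index would remove only part of that trend, which is why the adjustment is described above as partial.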

## Why spend time on this?

Fundamentally because, if you want to solve a problem, you first have to be sure that the data you are using and the way you are using it allow you to solve the problem! The profession has spent much time, energy, and analysis on developing techniques to solve the risk measurement problem, but with the underlying assumption that cumulation is the way to analyse insurance data.

Aside: this is why I enjoy Genetic Programming – not because the algorithm allows the automatic generation of solutions, but rather because you have to formulate the problem very precisely in order to ensure the right problem is solved.

## Understanding the problem

The objective in analysing insurance portfolios is to quantify the expected losses incurred by the insurance company and the volatility (the risk) associated with the portfolio, so that adequate money is raised to pay all liabilities, at a reasonable price, with an excellent profit. Additional benefits may arise, such as an improved understanding of the policies being written, targeting of more profitable customers, and so forth, but these are secondary.

Assume the data available are the loss data with the three attributes of accident time, calendar time, and payment. Forget about claim state for now, though this is an important factor for future projections.

We immediately identify two time attributes. This suggests time series models are likely a good starting point for analysis. We would also examine the distribution(s) of incremental losses rather than cumulate the losses over time, since cumulation of a time series would hide the volatility of the losses at the individual time points, which is the very component that we are interested in.
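The masking effect of cumulation can be seen in a small sketch. The two series below are hypothetical incremental payments for a single accident year: a reference pattern and an observed pattern that deviates from it. The same deviations are then measured on the cumulated series.

```python
from itertools import accumulate

# Hypothetical incremental payments for one accident year, by development lag.
reference = [100.0, 80.0, 60.0, 40.0, 20.0, 10.0]  # assumed underlying pattern
observed = [110.0, 70.0, 66.0, 34.0, 26.0, 6.0]    # invented noisy outcome


def relative_deviation(actual, expected):
    """Absolute deviation at each lag as a fraction of the expected value."""
    return [abs(a - e) / e for a, e in zip(actual, expected)]


def to_incremental(cumulative):
    """Recover incremental payments by differencing a cumulative series."""
    previous = [0.0] + cumulative[:-1]
    return [c - p for p, c in zip(previous, cumulative)]


inc_dev = relative_deviation(observed, reference)
cum_dev = relative_deviation(list(accumulate(observed)),
                             list(accumulate(reference)))

for lag, (i, c) in enumerate(zip(inc_dev, cum_dev)):
    print(lag, round(i, 3), round(c, 3))
```

In this example the relative deviation of the incremental payments grows in the tail, while the relative deviation of the cumulative payments shrinks toward zero. The information is not lost, since differencing recovers the increments, but it is no longer visible in the form the data are presented.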

Further, we need the ability to distinguish between parameters, parameter uncertainty, and the process volatility. Process volatility and parameter uncertainty drive the critical risk metrics which are essential to ensuring adequate capital is set aside to not only cover the expected losses, but also allow for the unexpected losses should they occur.
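One standard way to see why the two sources must be kept separate is the decomposition of the mean squared error of a forecast. The symbols here are introduced only for illustration: $Y$ is a future loss with mean $\mu$ and variance $\sigma^2$, and $\hat{\mu}$ is an estimate of $\mu$ from past data, assumed unbiased and independent of $Y$.

$$
\mathrm{E}\left[(Y - \hat{\mu})^2\right]
= \mathrm{E}\left[(Y - \mu)^2\right] + \mathrm{E}\left[(\hat{\mu} - \mu)^2\right]
= \underbrace{\sigma^2}_{\text{process volatility}} + \underbrace{\mathrm{Var}(\hat{\mu})}_{\text{parameter uncertainty}}
$$

The cross term vanishes under the independence assumption. More data can shrink the second term, but the first term remains however well the parameters are estimated.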

Beginning with this foundation, modelling techniques which take the fundamental time-series nature of the data into account are almost certain to provide superior performance to methodologies which mask (for historical reasons mentioned) the time series nature of the data.

## Is this new?

Actually, no. All the above considerations of analysis of P&C insurance data were presented many years ago. However, time series approaches are not typically taught to aspiring P&C actuaries. Why?

Perhaps several reasons:

- Tradition. Like any specialised profession, a system is developed to provide solutions and unless the system is convincingly broken, the uptake of new methodology is resisted.
- Statistical analysis is complicated. Applying a standard formula to get answers is “easy” when you know the formula.

## The catch

Misrepresenting data leads to a flawed model of the underlying data processes.

The likelihood of such a methodology resulting in the correct mean or a correct measure of the volatility is extremely low. The distributional assumptions are likely completely spurious as the fundamental nature of the data is not recognised.

## Wrong model = wrong conclusion, unless you’re unlucky

It is a common problem that the wrong statistical technique is applied to solve a statistical problem. This suggests the statement: “All models are wrong, but some are useful.” This is not entirely fair in my mind, as it (wrongly) places the blame on the model when the blame should actually be on the analyst and their choice of modelling method.

Although we will never find **the** model driving the underlying data generating process, we can often approximate the data process well (otherwise modelling of any kind would be pointless). These are the useful models. Then you are only unlucky if your model looks like it is useful, but fails when it comes to prediction.

## In summary

- The problem of quantifying risk is not a simple exercise
- Insurance data is fundamentally financial time series data
- The right starting point is critical to any statistical analysis
- We statisticians need to explain our solutions in a way that is meaningful to established professions

(*) In essence, Solvency II comprises insurance legislation aiming to improve policyholder protection by introducing a clear, comprehensive framework for a market-consistent risk model. In particular, insurance companies must be able to withstand a 1-in-200 year loss event in the next calendar year, encompassing all sources of risk: insurance and reserve risk, catastrophe risk, operational risk, and default risk, to name a few. Quantitative impact study documents and general discussions of Solvency II are available online. The legislation has been postponed many times.
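As a rough illustration of the 1-in-200 year requirement in the footnote above, it can be read as the 99.5th percentile of the distribution of losses over the coming year. The sketch below estimates that percentile from simulated losses. The lognormal distribution and its parameters are arbitrary assumptions, not calibrated to any real portfolio, and the `quantile` helper is a hypothetical name.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical annual aggregate losses; the lognormal parameters are arbitrary.
simulated_losses = [random.lognormvariate(0.0, 0.5) for _ in range(100_000)]


def quantile(values, p):
    """Smallest sample value with at least a fraction p of the sample at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[index]


mean_loss = statistics.fmean(simulated_losses)
loss_1_in_200 = quantile(simulated_losses, 0.995)

print("mean loss:", round(mean_loss, 3))
print("1-in-200 year loss:", round(loss_1_in_200, 3))
```

The gap between the two printed figures is driven by the tail of the loss distribution, which is why mis-stating volatility matters far more for capital than it does for the mean.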

**About David Munroe**

David Munroe leads Insureware’s outstanding statistical department. Comments in this article are the author’s own and do not necessarily represent the position of Insureware Pty Ltd.

He completed a Master’s degree in Statistics (with First Class Honours) from Massey University, New Zealand.

David has experience in statistical and actuarial analysis along with C++ programming knowledge. Previous projects include working with a Canadian insurance company on software training and implementation, resulting in significant modelling improvements (regions can be modelled within a working day, allowing analysts to focus on providing extracted insights to management).

David studied the art of Shaolin Kempo for over nine years, holds a second degree black belt, and is qualified in the use of Okinawan weaponry. He is also interested in music (piano), literature, photography, and self sufficiency. He also has two children on the autism spectrum.