Multivariate Classification Ratemaking Methods and Basic Principles of Generalized Linear Models: Practice Questions and Solutions

The Actuary's Free Study Guide for Exam 5 - Section 81

G. Stolyarov II
This section of sample problems and solutions is a part of The Actuary's Free Study Guide for Exam 5, authored by Mr. Stolyarov. This is Section 81 of the Study Guide.

This section of the study guide is intended to provide practice problems and solutions to accompany the pages of Basic Ratemaking, cited below. Students are encouraged to read these pages before attempting the problems. This study guide is entirely an independent effort by Mr. Stolyarov and is not affiliated with any organization(s) to whose textbooks it refers, nor does it represent such organization(s).

Some of the questions here ask for short written answers based on the reading. This is meant to give the student practice in answering questions of the format that will appear on Exam 5. Students are encouraged to type their own answers first and then to compare these answers with the solutions given here. Please note that the solutions provided here are not necessarily the only possible ones.

Source:
Werner, Geoff and Claudine Modlin. Basic Ratemaking. Casualty Actuarial Society. 2009. Chapter 10, pp. 171-176.

Original Problems and Solutions from The Actuary's Free Study Guide

Problem S5-81-1. Name three advantages of multivariate classification ratemaking methods over univariate methods.

Solution S5-81-1. The following advantages of multivariate classification ratemaking methods are discussed by Werner and Modlin, pp. 171-172.

1. Multivariate methods "consider all rating variables simultaneously and automatically adjust for exposure correlations between rating variables."

2. Multivariate methods "allow for the nature of the random process" and are able, in many cases, to largely extract the "signal" (systematic patterns in the data) from the "noise" (unsystematic patterns).

3. Multivariate methods "produce model diagnostics, additional information about the certainty of results and the appropriateness of the model fitted."

4. Multivariate methods "allow consideration of the interaction, or interdependency, between two or more rating variables."

Any three of the above suffice as an answer. Other valid answers may also be possible.

Problem S5-81-2. What is the difference between an exposure correlation and a response correlation (interaction)?

Solution S5-81-2. Exposure correlation refers to the uneven distribution of categories of one rating variable within the categories of another variable. An interaction occurs "when the effect of one variable varies according to the levels of another" (Werner and Modlin, p. 172). In exposure correlation, one variable does not have to affect the other; there just needs to be an uneven distribution of exposures such that a univariate analysis will be distorted. In a response correlation, the effect of one variable on the response depends on the level of the other variable.
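The distinction can be checked numerically. The sketch below uses made-up data (the age groups, genders, exposures, and pure premiums are all hypothetical and do not come from Werner and Modlin); the helper names `exposure_share` and `gender_relativity` are likewise only illustrative. Exposure correlation shows up in how the exposures are distributed, while an interaction shows up in how the response behaves.

```python
# Hypothetical data: (age_group, gender) -> (exposures, pure premium per exposure)
cells = {
    ("young", "male"): (800, 300.0),
    ("young", "female"): (200, 200.0),
    ("adult", "male"): (300, 150.0),
    ("adult", "female"): (700, 150.0),
}


def exposure_share(cells, age, gender):
    """Share of one age group's exposures that belongs to the given gender."""
    total = sum(e for (a, _), (e, _) in cells.items() if a == age)
    return cells[(age, gender)][0] / total


def gender_relativity(cells, age):
    """Male-to-female pure premium ratio within one age group."""
    return cells[(age, "male")][1] / cells[(age, "female")][1]


# Exposure correlation: the gender mix differs between age groups.
print(exposure_share(cells, "young", "male"))  # 0.8
print(exposure_share(cells, "adult", "male"))  # 0.3

# Interaction: the effect of gender on the response differs between age groups.
print(gender_relativity(cells, "young"))  # 1.5
print(gender_relativity(cells, "adult"))  # 1.0
```

In this made-up dataset both phenomena are present at once, but either one can occur without the other: equal gender relativities in both age groups would remove the interaction while leaving the uneven exposure mix in place.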

Problem S5-81-3.

(a) Which two assumptions of linear models (LMs) do generalized linear models (GLMs) remove?

(b) In GLMs, what does a link function accomplish?

(c) According to Werner and Modlin, p. 173, what are the three steps needed to solve a GLM?

Solution S5-81-3. This question is based on the discussion in Werner and Modlin, p. 173.

(a) GLMs remove the assumptions of normality (i.e., that the underlying random variable follows a Normal distribution) and of constant variance (i.e., that the variance of the error term ε is always the same throughout the distribution of the underlying random variable).

(b) A link function enables the modeler "to define the relationship between the expected response variable (e.g., claim severity) and the linear combination of the predictor variables (e.g., age of home, amount of insurance, etc.)" (Werner and Modlin, p. 173).
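In symbols, the role of the link function can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the quoted passage: g is the link function, μ_i is the expected response for observation i, the x_{ij} are the predictor values, and the β_j are the parameters to be estimated.

```latex
% General form: the link function g maps the expected response
% to the linear combination of predictors.
g(\mu_i) = \eta_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}

% Example with the log link, g(\mu) = \ln \mu:
\ln(\mu_i) = \eta_i
\quad\Longleftrightarrow\quad
\mu_i = e^{\beta_0} \cdot e^{\beta_1 x_{i1}} \cdots e^{\beta_p x_{ip}}
```

With the log link, each predictor contributes a multiplicative factor to the expected response, which corresponds to a multiplicative rating structure.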

(c) The following three steps needed to solve a GLM are given by Werner and Modlin, p. 173:

1. "Supply a modeling dataset with a suitable number of observations of the response variable and associated predictor variables to be considered for modeling."

2. "Select a link function to define the relationship between the systematic and random components."

3. "Specify the distribution of the underlying random process, typically a member of the exponential family of distributions (e.g., normal, Poisson, gamma, binomial, inverse Gaussian); this is done by specifying the mean and the variance of the distribution, the latter being a function of the mean."
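The three steps can be sketched in a few lines of Python. Everything in this sketch is an illustrative assumption rather than material from Werner and Modlin: the four-cell dataset, the variable names, the choice of a log link with a Poisson distribution, and the hypothetical fitting routine `fit_poisson_log`. Instead of calling a statistical package, the routine iterates the Poisson log-link likelihood equations, which require fitted claims to equal observed claims for every level of each variable.

```python
# Step 1: a (hypothetical) modeling dataset, one record per rating cell.
# Columns: territory, roof age, exposures, observed claim count.
data = [
    ("A", "new", 1000, 50),
    ("A", "old", 500, 50),
    ("B", "new", 400, 40),
    ("B", "old", 800, 160),
]

# Step 2: log link, so expected claims = exposures * terr[t] * roof[r].
# Step 3: Poisson distribution, whose variance equals its mean.


def fit_poisson_log(data, iterations=200):
    """Fit the multiplicative factors by iterating the likelihood equations.

    For a Poisson model with a log link, the maximum likelihood fit makes
    fitted claims equal observed claims for each level of each variable.
    The base level is folded into the territory factors.
    """
    terr = {row[0]: 1.0 for row in data}
    roof = {row[1]: 1.0 for row in data}
    for _ in range(iterations):
        for level in terr:
            rows = [r for r in data if r[0] == level]
            observed = sum(r[3] for r in rows)
            terr[level] = observed / sum(r[2] * roof[r[1]] for r in rows)
        for level in roof:
            rows = [r for r in data if r[1] == level]
            observed = sum(r[3] for r in rows)
            roof[level] = observed / sum(r[2] * terr[r[0]] for r in rows)
    return terr, roof


terr, roof = fit_poisson_log(data)

# Relativities are ratios to a chosen base level.
print("Territory B vs. A:", terr["B"] / terr["A"])
print("Old roof vs. new: ", roof["old"] / roof["new"])
```

The made-up claim counts were constructed to be exactly multiplicative (claim frequencies of 0.05, 0.10, 0.10, and 0.20), so both printed relativities converge to 2.0. In practice, one would rely on statistical software, which also supplies the model diagnostics that this sketch omits.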

Problem S5-81-4. Give three reasons why GLMs typically analyze loss cost data, as opposed to loss ratio data.

Solution S5-81-4. The following reasons why GLMs typically analyze loss cost data instead of loss ratio data are given by Werner and Modlin, p. 173:

1. "Modeling loss ratios requires premiums to be adjusted to current rate level at the granular level, and that can be practically difficult."

2. "Experienced actuaries have an a priori expectation of frequency and severity patterns (e.g., youthful drivers have higher frequencies). In contrast, the loss ratio patterns are dependent on the current rates. Thus, the actuary can better distinguish the signal from the noise when building models."

3. "Loss ratio models become obsolete when rates and rating structures are changed."

4. "There is no commonly accepted distribution for modeling loss ratios."

Any three of the above suffice as an answer. Other valid answers may also be possible.

Problem S5-81-5. An actuary developing rates for an insurance product that protects homeowners from anvils falling from the sky uses territory, amount of insurance, and roof age as rating variables. The actuary makes the observation that the univariate (one-way) relativity for Territory Q is significantly higher than the GLM-indicated relativity. Assuming that the GLM was properly set up, which of the following scenarios could account for this observation?

(a) Territory Q consists primarily of smaller homes that would suffer less total damage if an anvil were to fall on them.

(b) Territory Q was populated earlier than most territories and so contains predominantly older homes where the roofs have not been reinforced to protect against falling anvils.

(c) The univariate analysis failed to capture the considerable risk-aversion of homeowners in Territory Q, which was taken into account by the GLM.

(d) Territory Q is predominantly inhabited by more reckless homeowners who jump on their roofs regularly, making them more vulnerable to anvil damage.

(e) None of the above scenarios could account for this observation.

Solution S5-81-5.

The correct answer is (b). A GLM considers the effects of a given variable in the context of all the other variables. So the higher univariate relativity for Territory Q suggests that the univariate analysis attributed to Territory Q itself risk characteristics that were in fact due to another factor, such as greater roof age. A GLM, by distinguishing the effect of territory from the effect of roof age, would produce a lower relativity for Territory Q, compared to the relativity obtained via the univariate analysis.
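A small numerical sketch illustrates the reasoning behind answer (b). All figures below are hypothetical: a single comparison territory labeled "base", two roof-age bands, and made-up exposures and loss costs chosen so that Territory Q has mostly old roofs. The helper names `one_way_relativity` and `within_band_relativity` are also only illustrative.

```python
# Hypothetical data: (territory, roof_age) -> (exposures, loss cost per exposure)
cells = {
    ("Q", "old"): (900, 200.0),
    ("Q", "new"): (100, 100.0),
    ("base", "old"): (200, 160.0),
    ("base", "new"): (800, 80.0),
}


def average_loss_cost(cells, territory):
    """Exposure-weighted average loss cost for one territory, ignoring roof age."""
    rows = [(e, lc) for (t, _), (e, lc) in cells.items() if t == territory]
    return sum(e * lc for e, lc in rows) / sum(e for e, _ in rows)


def one_way_relativity(cells):
    """Univariate relativity of Territory Q to the base territory."""
    return average_loss_cost(cells, "Q") / average_loss_cost(cells, "base")


def within_band_relativity(cells, roof_age):
    """Relativity of Territory Q to base among homes of the same roof age."""
    return cells[("Q", roof_age)][1] / cells[("base", roof_age)][1]


print(one_way_relativity(cells))             # 190 / 96, about 1.98
print(within_band_relativity(cells, "old"))  # 1.25
print(within_band_relativity(cells, "new"))  # 1.25
```

Within each roof-age band, Territory Q is only 25% more costly than the base territory, yet the one-way relativity is nearly 2.0 because 90% of Territory Q's exposures have old roofs, which are twice as costly in both territories. Since these made-up loss costs are exactly multiplicative, a properly specified multiplicative model that includes roof age would indicate the 1.25 relativity for territory and assign the rest of the difference to roof age.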

Answer (a) is not correct; if it were, the GLM-indicated relativities would be higher than the univariate relativities for Territory Q, since the smaller home size (and, by implication, smaller amount of insurance) would have somewhat reduced the visible "signal" from territory-based phenomena.

Answer (c) is not correct; if homeowners in Territory Q are indeed more risk-averse than elsewhere, both the univariate analysis and the GLM would have captured this. The two analyses are presumably based on the same underlying data, and the risk-aversion of homeowners would be difficult to disentangle from the territory variable and to identify as a separate variable (as there are few objective, independently verifiable measures of risk-aversion).

Answer (d) is not correct, since the recklessness of homeowners is not a separate rating variable, and so its effects would not be disentangled from the effects of territory by a GLM. GLMs can provide information regarding interactions among variables that have been defined, but they cannot account for interactions within a variable (i.e., what, aside from roof age and amount of insurance in this model, would explain why Territory Q has a relativity that differs in a certain way from the relativities of the other territories).


Published by G. Stolyarov II

G. Stolyarov II is a science fiction novelist, independent essayist, poet, amateur mathematician, composer, author, and actuary.
