Exam-Style Questions on the Method of Moments, Significance Levels, and the Neyman-Pearson Lemma: Practice Problems and Solutions

The Actuary's Free Study Guide for Exam 3L - Section 54

G. Stolyarov II
This section of sample problems and solutions is a part of The Actuary's Free Study Guide for Exam 3L, authored by Mr. Stolyarov. This is Section 54 of the Study Guide.

Following Larsen and Marx 2006, we describe the method of moments as follows.

"Let y1, y2, ..., yn be a random sample from the continuous p.d.f. fY(y; θ1, θ2, ... ,θs). The method of moments estimates θ1e, θ2e, ... ,θse for the model's unknown parameters are the solutions of the s simultaneous equations

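$$\int_{-\infty}^{\infty} y\, f_Y(y; \theta_1, \theta_2, \ldots, \theta_s)\, dy = \frac{1}{n}\sum_{i=1}^{n} y_i$$

$$\int_{-\infty}^{\infty} y^2 f_Y(y; \theta_1, \theta_2, \ldots, \theta_s)\, dy = \frac{1}{n}\sum_{i=1}^{n} y_i^2$$

...

$$\int_{-\infty}^{\infty} y^s f_Y(y; \theta_1, \theta_2, \ldots, \theta_s)\, dy = \frac{1}{n}\sum_{i=1}^{n} y_i^s$$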

"If the underlying random variable is discrete with p.d.f. pX(k; θ1, θ2, ... ,θs), the method of moments estimates are the solutions of the system of equations

$$\sum_{\text{all } k} k^j\, p_X(k; \theta_1, \theta_2, \ldots, \theta_s) = \frac{1}{n}\sum_{i=1}^{n} y_i^j, \quad j = 1, 2, \ldots, s$$" (357-358).

Let M be the random variable representing the values in our sample and X be the random variable representing the values given by the distribution we are considering. According to the method of moments, the system of equations that can be set up to solve for our s unknown parameters can be conceived of as follows:

E[X] = E[M]

E[X^2] = E[M^2]

...

E[X^s] = E[M^s]

This is a more convenient way to remember the method of moments and is especially useful when one has memorized the formulas for the moments of the distribution with which one is working.
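As a minimal sketch of the sample side of these equations, the helper below computes the j-th raw sample moment (1/n)Σ y_i^j, which plays the role of E[M^j] above. The function name `sample_moment` and the data are illustrative assumptions, not part of the study guide.

```python
def sample_moment(values, j):
    """Return the j-th raw sample moment (1/n) * sum(y_i ** j)."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(y ** j for y in values) / len(values)


# Hypothetical data, chosen only to illustrate the call.
data = [1.0, 2.0, 3.0, 4.0]
first = sample_moment(data, 1)   # plays the role of E[M]
second = sample_moment(data, 2)  # plays the role of E[M^2]
print(first, second)
```

Equating these values to the distribution's theoretical moments gives the system to solve for the unknown parameters.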

A single-parameter Pareto distribution is different from the two-parameter Pareto distribution introduced in Section 47.

The single-parameter Pareto distribution has survival function

s(x) = (θ/x)^α, for some given value of θ and for x > θ.

Equivalently, its cumulative distribution function is F(x) = 1 - (θ/x)^α for x > θ.

The single-parameter Pareto distribution only describes values of x larger than θ. For instance, θ could be some minimum claim or loss size.

The mean of a single-parameter Pareto distribution, which is finite only when α > 1, is

E[X] = θα/(α - 1).
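When θ is known, setting this mean equal to a sample mean and solving for α gives a closed form. The short derivation below is a sketch; the symbol \bar{y} for the sample mean is introduced here for illustration, and the result requires \bar{y} > θ.

```latex
\frac{\theta\alpha}{\alpha - 1} = \bar{y}
\;\Longrightarrow\;
\theta\alpha = \bar{y}\,\alpha - \bar{y}
\;\Longrightarrow\;
\alpha = \frac{\bar{y}}{\bar{y} - \theta}
```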

Broverman defines a significance level for a hypothesis test as "the probability of rejecting the hypothesis H0, given that H0 is true." In problems involving significance levels, you will be given some critical region for values within which the hypothesis H0 must be rejected. Then you will be asked to find the conditional probability that, if H0 is true, you will find a value within that critical region.

The Neyman-Pearson Lemma is used to perform a hypothesis test between two point hypotheses: H0: θ = θ0 and H1: θ = θ1.

Hypothesis H0 is rejected in favor of H1 when

Λ(x) = L(θ0│x)/L(θ1│x) ≤ η, where P(Λ(x) ≤ η│H0) = α.

In the context of Exam 3L, you will likely be asked to analyze a hypothesis test under the following conditions.

A certain significance level will be given, as a percentage or probability.

You will be given a multiplicity of points and asked to determine which points are within the critical region defined by the given significance level.

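The working rule used in this section is as follows. A point n is treated as being within the critical region defined by a given significance level k if P(N = n │ θ = θ0) < k. This is a shortcut: the lemma itself ranks the points by the likelihood ratio Λ(n) = P(N = n │ θ = θ0)/P(N = n │ θ = θ1) and rejects H0 for the points with the smallest ratios, chosen so that the total probability of the region under H0 equals k (or, for a discrete N, does not exceed k).

So to answer a problem using this rule, it is necessary to consider the probability distribution of N, given that θ = θ0. You will typically be given such a distribution, so you will simply need to examine the conditional probability for each n, given that θ = θ0, to find whether this probability is less than k.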

Source: Broverman, Sam. Actuarial Exam Solutions - CAS Exam 3 - Fall 2006.

Larsen, Richard J. and Morris L. Marx. An Introduction to Mathematical Statistics and Its Applications. Fourth Edition. Pearson Prentice Hall: 2006. pp. 357-358.

"Neyman-Pearson Lemma" by Zaqrfv.

"Neyman-Pearson Lemma." Wikipedia, the Free Encyclopedia.

Original Problems and Solutions from The Actuary's Free Study Guide

Problem S3L54-1. Similar to Question 3 from the Casualty Actuarial Society's Fall 2006 Exam 3. A single-parameter Pareto distribution with parameter α models financial loss amounts of 200 or greater. You have a random sample of losses modeled by this distribution: 430, 316, 766, 453, 243, 424, 224, 535. Use the method of moments to estimate α for this sample.

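Solution S3L54-1. If a Pareto distribution only models amounts of 200 or greater, this means that θ = 200 for this distribution. For a single-parameter Pareto distribution, the density is zero below θ, so we know that

$$\int_{-\infty}^{\infty} y\, f_Y(y)\, dy = \int_{\theta}^{\infty} y\, f_Y(y)\, dy = E[Y] = \frac{\theta\alpha}{\alpha - 1},$$

which in our case is 200α/(α - 1). We are also given n = 8.

We use the method of moments, which in our case only requires a single equation:

$$\int_{-\infty}^{\infty} y\, f_Y(y; \theta_1, \theta_2, \ldots, \theta_s)\, dy = \frac{1}{n}\sum_{i=1}^{n} y_i$$

$$\int_{\theta}^{\infty} y\, f_Y(y)\, dy = \frac{1}{8}\sum_{i=1}^{8} y_i$$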

200α/(α - 1) = (1/8)(430+316+766+453+243+424+224+535)

200α/(α - 1)= 423.875

200α = 423.875α - 423.875

423.875 = 223.875α

α = about 1.893355667.
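The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that solves the single moment equation 200α/(α - 1) = sample mean in closed form; the variable names are illustrative.

```python
losses = [430, 316, 766, 453, 243, 424, 224, 535]
theta = 200  # minimum loss size given in the problem

sample_mean = sum(losses) / len(losses)

# Solve theta * alpha / (alpha - 1) = sample_mean for alpha.
alpha_hat = sample_mean / (sample_mean - theta)

print(sample_mean)          # 423.875
print(round(alpha_hat, 6))  # 1.893356
```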

Problem S3L54-2. Similar to Question 5 from the Casualty Actuarial Society's Fall 2007 Exam 3.

Now you are given a typical two-parameter Pareto distribution with parameters θ and α. A random sample from this distribution contains the following values: 2000, 13515, 400000, 16437, 4993, 12000.

Use the method of moments to estimate the parameter α for this distribution.

(Hint: The first and second moments for this distribution are given in Section 51).

Solution S3L54-2. Here, we are solving for two unknown parameters, so we will need a system of two equations. Let M be the random variable representing the values in our sample and X be the random variable representing the values given by the distribution we are considering. According to the method of moments, the system of equations that can be set up to solve for θ and α can be conceived of as follows:

E[X] = E[M]

E[X^2] = E[M^2]

For a two-parameter Pareto distribution,

E[X] = θ/(α - 1) and E[X^2] = 2θ^2/[(α - 1)(α - 2)].

We can compute E[M] = (2000 + 13515 + 400000 + 16437 + 4993 + 12000)/6 = 74824.1666667.

We can compute E[M^2] = (2000^2 + 13515^2 + 400000^2 + 16437^2 + 4993^2 + 12000^2)/6 = 26770960040.5.

Thus, our system of equations appears as follows.

(i) θ/(α - 1) = 74824.1666667

(ii) 2θ^2/[(α - 1)(α - 2)] = 26770960040.5.

We divide (ii) by (i) to get

(iii) 2θ/(α - 2) = 357784.9408.

We transform (i) and (iii):

(i)': θ = 74824.1666667(α - 1)

(iii)': θ = 178892.4704(α - 2)

Thus, we have the following equality:

74824.1666667(α - 1) = 178892.4704(α - 2)

α - 1 = 2.390838126(α - 2)

α - 1 = 2.390838126α - 4.781676252

3.781676252 = 1.390838126α

α = about 2.718990931.
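As a cross-check, the same system can be solved in closed form: eliminating θ from the two moment equations gives α = 2(m2 - m1^2)/(m2 - 2m1^2), where m1 and m2 are the first and second sample moments. The sketch below uses that rearrangement; the names `m1`, `m2`, `alpha_hat`, and `theta_hat` are illustrative, and the θ estimate is an extra value the problem does not ask for.

```python
sample = [2000, 13515, 400000, 16437, 4993, 12000]
n = len(sample)

m1 = sum(sample) / n                  # first sample moment
m2 = sum(x ** 2 for x in sample) / n  # second sample moment

# From theta/(alpha-1) = m1 and 2*theta**2/((alpha-1)*(alpha-2)) = m2,
# eliminating theta gives alpha = 2*(m2 - m1**2) / (m2 - 2*m1**2).
alpha_hat = 2 * (m2 - m1 ** 2) / (m2 - 2 * m1 ** 2)
theta_hat = m1 * (alpha_hat - 1)      # back-substitute into equation (i)

print(round(alpha_hat, 4))
print(round(theta_hat, 1))
```

The value of `alpha_hat` should agree with the hand calculation to within the rounding carried through the intermediate steps.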

Problem S3L54-3. Similar to Question 4 from the Casualty Actuarial Society's Fall 2006 Exam 3. The probability distribution of the random variable N is dependent on the value of parameter θ, which can be either θ0 or θ1. The probability distributions that result are illustrated as follows:

n     f(n; θ0)    f(n; θ1)
3     0.3         0.12
4     0.03        0.135
5     0.312       0.123
6     0.158       0.422
7     0.12        0.07
8     0.08        0.13

You are testing the following two hypotheses using the Neyman-Pearson Lemma and a single observation: H0: θ = θ0 and H1: θ = θ1.

Which of these points is in the critical region defined by the significance level of 10%? More than one correct answer is possible.

(a) n is 3

(b) n is 4

(c) n is 5

(d) n is 6

(e) n is 7

(f) n is 8

Solution S3L54-3. We use the fact that a point is within the critical region defined by a given significance level k if P(N = n │ θ = θ0) < k. Here, we need to find all points n for which P(N = n │ θ = θ0) < 0.1.

(a) n is 3 → P(N = 3 │ θ = θ0) = 0.3 > 0.1, so (a) is not in the critical region.

(b) n is 4 → P(N = 4 │ θ = θ0) = 0.03 < 0.1, so (b) is in the critical region.

(c) n is 5 → P(N = 5 │ θ = θ0) = 0.312 > 0.1, so (c) is not in the critical region.

(d) n is 6 → P(N = 6 │ θ = θ0) = 0.158 > 0.1, so (d) is not in the critical region.

(e) n is 7 → P(N = 7 │ θ = θ0) = 0.12 > 0.1, so (e) is not in the critical region.

(f) n is 8 → P(N = 8 │ θ = θ0) = 0.08 < 0.1, so (f) is in the critical region.

Thus, (b) and (f) are the correct answers.
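A minimal sketch of the check carried out above, using the table from the problem. Besides applying the rule used in the solution, it tabulates the likelihood ratio Λ(n) = f(n; θ0)/f(n; θ1) from the statement of the lemma and the total probability of the selected points under H0, so that the shortcut can be compared with a ranking by Λ. The variable names are illustrative.

```python
# Probabilities from the table in Problem S3L54-3.
f0 = {3: 0.3, 4: 0.03, 5: 0.312, 6: 0.158, 7: 0.12, 8: 0.08}    # under theta0
f1 = {3: 0.12, 4: 0.135, 5: 0.123, 6: 0.422, 7: 0.07, 8: 0.13}  # under theta1
k = 0.10  # significance level

# The rule applied in the solution: keep points with P(N = n | theta0) < k.
region = [n for n in sorted(f0) if f0[n] < k]

# Total probability of those points when H0 is true.
size = sum(f0[n] for n in region)

# Likelihood ratio for each point, as in the lemma's statement.
ratio = {n: f0[n] / f1[n] for n in f0}

print(region, round(size, 2))
for n in sorted(ratio, key=ratio.get):  # smallest ratio first
    print(n, round(ratio[n], 3))
```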

Problem S3L54-4. Similar to Question 6 from the Casualty Actuarial Society's Fall 2007 Exam 3. You are using two methods to estimate the parameter θ for an exponential distribution. The estimate θa is the maximum likelihood estimate. The estimate θb is obtained via the method of moments. Assume you have a sample of size n, and let the sample mean be equal to θs. What is the value of θa - θb, expressed using solely numbers and/or the values θs and n?

Solution S3L54-4. We find θb by using the fact that the method of moments system of equations in this case will be the single equation E[X] = E[M]. We are given the sample mean E[M] = θs, and we know that the parameter θ is the mean of an exponential distribution. So E[X] from the method of moments is just θb, and therefore θb = θs.

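Now we find θa, the value of θ that maximizes the likelihood function

$$L(\theta) = \prod_{i=1}^{n} f_Y(y_i; \theta).$$

For every yi under an exponential distribution, fY(yi; θ) = (1/θ)exp[-yi/θ].

Thus,

$$L(\theta) = \prod_{i=1}^{n} f_Y(y_i; \theta) = \left(\frac{1}{\theta}\right)^{n} \exp\left[-\frac{1}{\theta}\sum_{i=1}^{n} y_i\right].$$

To find the maximum of this function, we can set its derivative equal to zero.

However, the natural logarithm of this function will also have a derivative of zero wherever the original function has a derivative of zero, and the natural logarithm function is easier to work with.

We find

$$N(\theta) = \ln L(\theta) = \ln\left[\left(\frac{1}{\theta}\right)^{n}\right] + \ln\left(\exp\left[-\frac{1}{\theta}\sum_{i=1}^{n} y_i\right]\right) = n\ln\left(\frac{1}{\theta}\right) - \frac{1}{\theta}\sum_{i=1}^{n} y_i.$$

We also note that θs, the sample mean, is equal to (1/n) times the sum of the yi, and so the sum of the yi is nθs.

Thus,

$$N(\theta) = n\ln\left(\frac{1}{\theta}\right) - \frac{n\theta_s}{\theta}.$$

We take the derivative and set N'(θa) = 0:

$$N'(\theta) = -\frac{n}{\theta} + \frac{n\theta_s}{\theta^2}$$

$$0 = -\frac{n}{\theta_a} + \frac{n\theta_s}{\theta_a^2}$$

$$\frac{n}{\theta_a} = \frac{n\theta_s}{\theta_a^2}$$

$$n\theta_a = n\theta_s$$

$$\theta_a = \theta_s$$

Since N''(θs) = -n/θs^2 < 0, this stationary point is a maximum.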

We have shown that for any exponential distribution and any sample of any size n,

θa = θb = θs. That is, both the maximum likelihood estimate and the estimate using the method of moments will be equal to the sample mean. Therefore, it is always the case that θa - θb = 0.
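The identity can also be checked numerically. The sketch below uses a hypothetical sample (not from the study guide) and a crude grid search over θ, rather than calculus, to locate the maximizer of the exponential log-likelihood; the result should land on the sample mean up to the grid spacing.

```python
import math


def log_likelihood(theta, ys):
    """Exponential log-likelihood: -n*ln(theta) - sum(y)/theta."""
    return -len(ys) * math.log(theta) - sum(ys) / theta


# Hypothetical sample, used only to illustrate the identity.
ys = [0.5, 1.2, 2.0, 3.1, 0.7]
theta_s = sum(ys) / len(ys)  # sample mean
theta_b = theta_s            # method of moments: E[X] = theta = sample mean

# Crude grid search for the maximizer of the log-likelihood.
grid = [0.001 * i for i in range(1, 10001)]  # 0.001, 0.002, ..., 10.000
theta_a = max(grid, key=lambda t: log_likelihood(t, ys))

print(theta_s, theta_a, theta_a - theta_b)
```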

Problem S3L54-5. Similar to Question 7 from the Casualty Actuarial Society's Fall 2007 Exam 3. The random variable Y is a sum of independent random variables X1 through Xn, each of which follows a Poisson distribution with mean Λ. Here, the sample size n is equal to 20. You are performing a test between two hypotheses: H0: Λ = 0.7 and H1: Λ < 0.7. The critical region for this test for rejecting H0 is Y ≤ 2. What is the significance level of this hypothesis test? Hint: The sum of n independent Poisson variables, each with mean Λ, is itself a Poisson variable with mean nΛ. Hint II: The answer will be very small!

Solution S3L54-5. We recall that Broverman defines a significance level for a hypothesis test as "the probability of rejecting the hypothesis H0, given that H0 is true."

This is the probability Pr(Y ≤ 2 │ H0 is true) = Pr(Y ≤ 2 │ Λ = 0.7).

If H0 is true, then Y is the sum of 20 independent Poisson random variables, each with mean Λ = 0.7.

Thus, Y is itself a Poisson random variable with mean λ = 0.7*20 = 14.

Thus, we want to find Pr(Y ≤ 2 │ λ = 14) = Pr(Y = 0) + Pr(Y = 1) + Pr(Y = 2), which is

$$e^{-\lambda} + \lambda e^{-\lambda} + \frac{1}{2}\lambda^2 e^{-\lambda} = e^{-14} + 14e^{-14} + \frac{1}{2}(14^2)e^{-14} = 113e^{-14} \approx 0.00009396274526.$$
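The final figure can be reproduced with the standard library alone. This is a minimal sketch; the helper name `poisson_cdf` is an illustrative choice, and it simply sums the Poisson probabilities for 0, 1, and 2.

```python
import math


def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Poisson(lam), summed term by term."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


n, rate_h0 = 20, 0.7
lam = n * rate_h0  # mean of Y under H0 (14, up to float rounding)
significance = poisson_cdf(2, lam)

print(significance)
```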


Published by G. Stolyarov II

G. Stolyarov II is a science fiction novelist, independent essayist, poet, amateur mathematician, composer, author, and actuary.