Order Statistics, Likelihood Functions, and Maximum Likelihood Estimates: Practice Problems and Solutions

The Actuary's Free Study Guide for Exam 3L - Section 53

G. Stolyarov II
This section of sample problems and solutions is Section 53 of The Actuary's Free Study Guide for Exam 3L, authored by Mr. Stolyarov.

Following Larsen and Marx (2006), we let Y be a continuous random variable for which $y'_1, y'_2, \ldots, y'_n$ are the values of a random sample of size n, arranged so that $y'_1 < y'_2 < \cdots < y'_n$.

We can define the random variable $Y'_i$, the ith order statistic of Y, as the variable that takes the value $y'_i$, where $1 \le i \le n$.

The ith order statistic of Y has the following probability density function (p. d. f.):

$f_{Y'_i}(y) = \frac{n!}{(i-1)!\,(n-i)!}\,[F(y)]^{i-1}\,[1-F(y)]^{n-i}\,f(y)$, where f(y) is the p. d. f. of Y and F(y) is the cumulative distribution function (c. d. f.) of Y.
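The order-statistic density above can be sketched in a few lines of Python. The function name `order_stat_pdf` and the Uniform(0, 1) parent distribution used for the check are assumptions chosen for illustration and do not come from Larsen and Marx. The check confirms that the density of the 2nd order statistic in a sample of 4 integrates to approximately 1.

```python
import math

def order_stat_pdf(y, i, n, pdf, cdf):
    """Density of the i-th order statistic (1 <= i <= n) evaluated at y."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    F = cdf(y)
    return coeff * F ** (i - 1) * (1.0 - F) ** (n - i) * pdf(y)

# Illustrative parent distribution: Uniform(0, 1), so f(y) = 1 and F(y) = y.
def uniform_pdf(y):
    return 1.0

def uniform_cdf(y):
    return y

# Midpoint-rule integral of the density over [0, 1]; it should be close to 1.
steps = 100_000
total = sum(
    order_stat_pdf((k + 0.5) / steps, 2, 4, uniform_pdf, uniform_cdf)
    for k in range(steps)
) / steps
print(total)
```

Any other continuous parent distribution can be substituted by passing its own p. d. f. and c. d. f. as the last two arguments.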

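When Y takes only non-negative values, we can also find $E[Y'_i] = \int_0^\infty y\,f_{Y'_i}(y)\,dy = \int_0^\infty s_{Y'_i}(y)\,dy$, where $s_{Y'_i}(y) = 1 - F_{Y'_i}(y)$ is the survival function of $Y'_i$.

Then $E[Y'_i]$ can be used to estimate some particular value $Y_k$ of Y, such as a percentile.

If this is the case, then the bias of the ith order statistic can be determined as follows:

$\mathrm{Bias}(Y'_i) = E[Y'_i] - Y_k$.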

The following integration rule offers a useful shortcut for problems involving integration by parts:

$\int_0^\infty t^n e^{-at}\,dt = \frac{n!}{a^{n+1}}$, for a > 0 and any non-negative integer n.
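As a quick sanity check, the shortcut can be compared against a numerical integral. This is only a sketch: the function names, the midpoint rule, and the truncation of the upper limit at 60 are assumptions made for illustration, and the truncation is adequate only when a is not too small.

```python
import math

def shortcut(n, a):
    """Closed form n!/a^(n+1) for the integral of t^n * e^(-a*t) over [0, infinity)."""
    return math.factorial(n) / a ** (n + 1)

def numeric_integral(n, a, upper=60.0, steps=200_000):
    """Midpoint-rule approximation on [0, upper]; the tail beyond `upper` is ignored."""
    h = upper / steps
    return h * sum(
        ((k + 0.5) * h) ** n * math.exp(-a * (k + 0.5) * h) for k in range(steps)
    )

# Compare the two for a few illustrative (n, a) pairs.
for n, a in [(1, 3.0), (1, 4.0), (3, 2.0)]:
    print(n, a, shortcut(n, a), numeric_integral(n, a))
```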

The following definition of the likelihood function is offered by Larsen and Marx.

"Let $k_1, k_2, \ldots, k_n$ be a random sample of size n from the discrete p. d. f. $p_X(k; \theta)$, where θ is an unknown parameter. The likelihood function, L(θ), is the product of the p. d. f. evaluated at the n $k_i$s. That is, $L(\theta) = \prod_{i=1}^{n} p_X(k_i; \theta)$."

If the p. d. f. in question is continuous and equal to $f_Y(y; \theta)$, then for a random sample $y_1, y_2, \ldots, y_n$ of size n, $L(\theta) = \prod_{i=1}^{n} f_Y(y_i; \theta)$.

Now we can define the maximum likelihood estimate, following Larsen and Marx.

"Let $L(\theta) = \prod_{i=1}^{n} p_X(k_i; \theta)$ and $L(\theta) = \prod_{i=1}^{n} f_Y(y_i; \theta)$ be the likelihood functions corresponding to random samples $k_1, k_2, \ldots, k_n$ and $y_1, y_2, \ldots, y_n$ drawn from the discrete p. d. f. $p_X(k_i; \theta)$ and the continuous p. d. f. $f_Y(y_i; \theta)$, respectively, where θ is an unknown parameter. In each case, let $\theta_e$ be a value of the parameter such that $L(\theta_e) \ge L(\theta)$ for all possible values of θ. Then $\theta_e$ is called a maximum likelihood estimate for θ."

Some of the problems in this section were designed to be similar to problems from past versions of the Casualty Actuarial Society's Exam 3L and the Society of Actuaries' Exam MLC. They use original exam questions as their inspiration - and the specific inspiration for each problem is cited so as to give students a chance to see the original. All of the original problems are publicly available, and students are encouraged to refer to them. But all of the values, names, conditions, and calculations in the problems here are the original work of Mr. Stolyarov.

Sources: Broverman, Sam. Actuarial Exam Solutions - CAS Exam 3 - Fall 2006.

Larsen, Richard J. and Morris L. Marx. An Introduction to Mathematical Statistics and Its Applications. Fourth Edition. Pearson Prentice Hall: 2006. pp. 241-243, 347.

Original Problems and Solutions from The Actuary's Free Study Guide

Problem S3L53-1.

Similar to part of Question 1 from the Casualty Actuarial Society's Fall 2006 Exam 3.

The 2nd order statistic in a sample of 4 is used to estimate the 40th percentile of an exponential distribution with mean θ. What is the actual 40th percentile of this distribution, in terms of θ?

Solution S3L53-1. The 40th percentile of an exponential distribution is the value y for which F(y) = 0.4 and s(y) = 1 - F(y) = 0.6. For an exponential distribution with mean θ, $s(y) = e^{-y/\theta}$.

Thus, for the 40th percentile, $0.6 = e^{-y/\theta}$ and $\ln(0.6) = -y/\theta$, so $y = -\theta\ln(0.6) = \theta\ln(5/3)$.

So the actual 40th percentile is $\theta\ln(5/3)$, or about 0.5108θ.
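The percentile calculation can be written as a small Python helper. The function name `exponential_percentile` and the choice θ = 1 are assumptions for illustration; since the percentile scales linearly with θ, the printed value is the coefficient of θ.

```python
import math

def exponential_percentile(p, theta):
    """Value y with F(y) = p for an exponential distribution with mean theta."""
    # Solve 1 - p = exp(-y / theta) for y.
    return -theta * math.log(1.0 - p)

theta = 1.0  # illustrative value
y40 = exponential_percentile(0.40, theta)
print(y40)
```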

Problem S3L53-2.

Similar to part of Question 1 from the Casualty Actuarial Society's Fall 2006 Exam 3.

The 2nd order statistic in a sample of 4 is used to estimate the 40th percentile of an exponential distribution with mean θ. What is the estimated 40th percentile of this distribution, in terms of θ, given by using the 2nd order statistic?

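Solution S3L53-2. We need to find $E[Y'_2] = \int_0^\infty y\,f_{Y'_2}(y)\,dy$.

We find $f_{Y'_2}(y)$ using the formula $f_{Y'_i}(y) = \frac{n!}{(i-1)!\,(n-i)!}\,[F(y)]^{i-1}\,[1-F(y)]^{n-i}\,f(y)$.

Here, n = 4, i = 2, $f(y) = (1/\theta)e^{-y/\theta}$, and $F(y) = 1 - e^{-y/\theta}$.

Thus,

$f_{Y'_2}(y) = \frac{4!}{1!\,2!}\,(1 - e^{-y/\theta})\,(e^{-y/\theta})^2\,(1/\theta)e^{-y/\theta}$

$f_{Y'_2}(y) = (12/\theta)(1 - e^{-y/\theta})\,e^{-3y/\theta}$

$f_{Y'_2}(y) = (12/\theta)(e^{-3y/\theta} - e^{-4y/\theta})$

Thus, $E[Y'_2] = \int_0^\infty y\,(12/\theta)(e^{-3y/\theta} - e^{-4y/\theta})\,dy$

$E[Y'_2] = (12/\theta)\int_0^\infty y\,(e^{-3y/\theta} - e^{-4y/\theta})\,dy$

$E[Y'_2] = (12/\theta)\left(\int_0^\infty y\,e^{-3y/\theta}\,dy - \int_0^\infty y\,e^{-4y/\theta}\,dy\right)$

Now we can use the integration shortcut $\int_0^\infty t^n e^{-at}\,dt = \frac{n!}{a^{n+1}}$ with n = 1 to find

$\int_0^\infty y\,e^{-3y/\theta}\,dy = \frac{1}{(3/\theta)^2} = \frac{\theta^2}{9}$ and $\int_0^\infty y\,e^{-4y/\theta}\,dy = \frac{1}{(4/\theta)^2} = \frac{\theta^2}{16}$.

Thus, $E[Y'_2] = (12/\theta)\left(\frac{\theta^2}{9} - \frac{\theta^2}{16}\right)$

$E[Y'_2] = 12\theta\left(\frac{1}{9} - \frac{1}{16}\right) = 12\theta \cdot \frac{7}{144} = \frac{7\theta}{12}$.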
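The result 7θ/12 can be checked with a simple Monte Carlo simulation. This is a sketch under stated assumptions: the function name, the choice θ = 1, the number of trials, and the fixed seed are all chosen here for illustration. The simulated mean should land close to the exact value, though not exactly on it.

```python
import random

def simulate_second_order_statistic_mean(theta, trials=200_000, seed=0):
    """Average of the 2nd smallest value in repeated exponential samples of size 4."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # random.expovariate takes the rate, which is 1 / mean.
        sample = sorted(rng.expovariate(1.0 / theta) for _ in range(4))
        total += sample[1]  # index 1 is the 2nd order statistic
    return total / trials

theta = 1.0  # illustrative value
estimate = simulate_second_order_statistic_mean(theta)
exact = 7.0 * theta / 12.0
print(estimate, exact)
```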

Problem S3L53-3.

Similar to part of Question 1 from the Casualty Actuarial Society's Fall 2006 Exam 3.

The 2nd order statistic in a sample of 4 is used to estimate the 40th percentile of an exponential distribution with mean θ. What is the bias resulting from estimating the 40th percentile of this distribution by using the 2nd order statistic? Give your answer in terms of θ.

Solution S3L53-3. We use the formula $\mathrm{Bias}(Y'_i) = E[Y'_i] - Y_k$.

From Solutions S3L53-1 and S3L53-2, we know that $E[Y'_2] = 7\theta/12$ and $Y_k = \theta\ln(5/3)$.

Thus, $\mathrm{Bias}(Y'_2) = 7\theta/12 - \theta\ln(5/3)$ = about 0.0725077096θ.
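The bias can also be reproduced numerically by integrating y times the density of the 2nd order statistic and subtracting the true percentile. The function names, the midpoint rule, the truncation of the integral at 60θ, and the choice θ = 1 are assumptions made for this sketch.

```python
import math

def second_of_four_density(y, theta):
    """Density of the 2nd order statistic in an exponential sample of size 4."""
    return (12.0 / theta) * (math.exp(-3.0 * y / theta) - math.exp(-4.0 * y / theta))

def expected_second_of_four(theta, upper_multiple=60.0, steps=200_000):
    """Midpoint-rule approximation of E[Y'_2] on [0, upper_multiple * theta]."""
    h = upper_multiple * theta / steps
    return h * sum(
        (k + 0.5) * h * second_of_four_density((k + 0.5) * h, theta)
        for k in range(steps)
    )

theta = 1.0  # illustrative value
expected = expected_second_of_four(theta)
percentile_40 = theta * math.log(5.0 / 3.0)
bias = expected - percentile_40
print(expected, percentile_40, bias)
```

The positive bias indicates that, on average, the 2nd order statistic in a sample of 4 overestimates the 40th percentile.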

Problem S3L53-4.

Similar to part of Question 2 from the Casualty Actuarial Society's Fall 2006 Exam 3.

The amount of time in hours it takes Hernando to write a property deed follows the cumulative distribution function $F(x) = x^{\theta+5}$, where 0 ≤ x ≤ 1 and θ > -5 is an unknown parameter. You are given the following sample of six completion times, each for one property deed:

0.24, 0.643, 0.46, 0.45, 0.34, 0.36

Find the likelihood function L(θ) based on this sample.

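Solution S3L53-4. We use the formula $L(\theta) = \prod_{i=1}^{n} f_Y(y_i; \theta)$.

Differentiating the c. d. f. lowers the exponent by one, so we find $f_X(x) = F'(x) = (\theta+5)x^{\theta+4}$.

Thus,

$L(\theta) = (\theta+5)(0.24)^{\theta+4} \cdot (\theta+5)(0.643)^{\theta+4} \cdot (\theta+5)(0.46)^{\theta+4} \cdot (\theta+5)(0.45)^{\theta+4} \cdot (\theta+5)(0.34)^{\theta+4} \cdot (\theta+5)(0.36)^{\theta+4}$

$L(\theta) = (\theta+5)^6\,(0.24 \cdot 0.643 \cdot 0.46 \cdot 0.45 \cdot 0.34 \cdot 0.36)^{\theta+4}$

$L(\theta) = (\theta+5)^6\,(0.003909975)^{\theta+4}$.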
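The likelihood can be evaluated directly in Python. Differentiating $F(x) = x^{\theta+5}$ gives the density $(\theta+5)x^{\theta+4}$, which is what this sketch uses. The function names are assumptions for illustration, and `math.prod` requires Python 3.8 or later. The sketch compares the term-by-term product with the collapsed form $(\theta+5)^6 P^{\theta+4}$, where P is the product of the six observations.

```python
import math

times = [0.24, 0.643, 0.46, 0.45, 0.34, 0.36]  # sample from the problem

def density(x, theta):
    """f(x) = F'(x) for F(x) = x^(theta + 5) on [0, 1]."""
    return (theta + 5.0) * x ** (theta + 4.0)

def likelihood(theta, sample):
    """Product of the density over every observation."""
    return math.prod(density(x, theta) for x in sample)

def likelihood_collapsed(theta, sample):
    """Same likelihood, written as (theta + 5)^n * (product of sample)^(theta + 4)."""
    return (theta + 5.0) ** len(sample) * math.prod(sample) ** (theta + 4.0)

print(math.prod(times))
for theta in (-4.5, -4.0, -3.5, -3.0):  # illustrative parameter values
    print(theta, likelihood(theta, times), likelihood_collapsed(theta, times))
```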

Problem S3L53-5.

Similar to part of Question 2 from the Casualty Actuarial Society's Fall 2006 Exam 3.

The amount of time in hours it takes Hernando to write a property deed follows the cumulative distribution function $F(x) = x^{\theta+5}$, where 0 ≤ x ≤ 1 and θ > -5 is an unknown parameter. You are given the following sample of six completion times, each for one property deed:

0.24, 0.643, 0.46, 0.45, 0.34, 0.36

Find the maximum likelihood estimate for θ using this sample.

Solution S3L53-5. As in Problem S3L53-4, the density is $f_X(x) = F'(x) = (\theta+5)x^{\theta+4}$, so $L(\theta) = (\theta+5)^6\,(0.003909975)^{\theta+4}$.

The maximum of L(θ) will occur where L'(θ) = 0.

We find $L'(\theta) = 6(\theta+5)^5\,(0.003909975)^{\theta+4} + (\theta+5)^6\,\ln(0.003909975)\,(0.003909975)^{\theta+4}$.

We set $0 = 6(\theta+5)^5\,(0.003909975)^{\theta+4} + (\theta+5)^6\,\ln(0.003909975)\,(0.003909975)^{\theta+4}$ and solve for θ.

$6(\theta+5)^5\,(0.003909975)^{\theta+4} = -(\theta+5)^6\,\ln(0.003909975)\,(0.003909975)^{\theta+4}$

$6(\theta+5)^5 = -(\theta+5)^6\,\ln(0.003909975)$

$6 = -(\theta+5)\,\ln(0.003909975)$

$\theta + 5 = -6/\ln(0.003909975)$

$\theta = -6/\ln(0.003909975) - 5$ = about -3.917792703.
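The estimate can be confirmed numerically. The sketch below works with the log-likelihood for the density $(\theta+5)x^{\theta+4}$, which has the same maximizer as L(θ) and is numerically better behaved. The function names and the use of a ternary search over an assumed interval (-4.999, 5) are choices made here for illustration and are not part of the original solution. Ternary search is appropriate because the log-likelihood is concave in θ.

```python
import math

times = [0.24, 0.643, 0.46, 0.45, 0.34, 0.36]  # sample from the problem

def log_likelihood(theta, sample):
    """ln L(theta) = n * ln(theta + 5) + (theta + 4) * sum(ln x_i)."""
    return len(sample) * math.log(theta + 5.0) + (theta + 4.0) * sum(
        math.log(x) for x in sample
    )

def mle_closed_form(sample):
    """Solution of d/d(theta) ln L = 0: theta = -n / sum(ln x_i) - 5."""
    return -len(sample) / sum(math.log(x) for x in sample) - 5.0

def mle_ternary_search(sample, lo=-4.999, hi=5.0, iterations=200):
    """Maximize the concave log-likelihood by repeatedly shrinking [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, sample) < log_likelihood(m2, sample):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

theta_closed = mle_closed_form(times)
theta_search = mle_ternary_search(times)
print(theta_closed, theta_search)
```

Both approaches should agree with the value found by hand, up to the precision of the search.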

