3. Expected Values

1 Motivation

To describe a random experiment, we often use measures that summarize the available data. The "mean" is one of those measures, since it helps us locate the center of the distribution. For example:

Example 1.1 Students who enroll in the postgraduate programme in Finance have to complete 4 curricular units, 2 of them with 10 ECTS and the remaining 2 with 5 ECTS. João completed the postgraduate programme in Finance with the following marks:

  • 15 - course with 10 ECTS

  • 13 - course with 10 ECTS

  • 16 - course with 5 ECTS

  • 14 - course with 5 ECTS

The average final grade of João was 14, since $\frac{15\times 10+13\times 10+16\times 5+14\times 5}{30}=14.(3)\approx 14.$
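This weighted-average computation is easy to reproduce; a minimal Python sketch with the marks and ECTS weights of the example:

```python
# Weighted average of the final marks, weighted by ECTS credits.
marks = [15, 13, 16, 14]
ects = [10, 10, 5, 5]

weighted_mean = sum(m * w for m, w in zip(marks, ects)) / sum(ects)
print(weighted_mean)  # 14.333..., i.e. 14.(3)
```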

2 The Expected Value of A Random Variable

2.1 Discrete Random Variables

Let $X$ be a discrete random variable and let $D_X$ be the set of discontinuity points of the cumulative distribution function of $X$. For generality, let us assume that the number of elements of $D_X$ is countably infinite, that is, $D_X=\{x_1,x_2,\dots\}.$

The probability function of $X$ is given by $f_X(x)=\begin{cases}P(X=x), & x\in D_X\\ 0, & x\notin D_X.\end{cases}$

Expected value of a discrete random variable: The expected value of a random variable, denoted by $E(X)$ or $\mu_X$ and also known as its population mean, is the weighted average of its possible values, the weights being the probabilities attached to those values: $\mu_X=E(X)=\sum_{x\in D_X}x\times f_X(x)=\sum_{i=1}^{\infty}x_i\times f_X(x_i),$ provided that $\sum_{x\in D_X}|x|\times f_X(x)=\sum_{i=1}^{\infty}|x_i|\times f_X(x_i)<+\infty.$

Example 2.1 Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$ The expected value of $X$ is given by $E(X)=p\times 1+(1-p)\times 0=p.$
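The definition translates directly into code: represent the probability function as a table of (value, probability) pairs and take the probability-weighted sum. A minimal sketch, using the variable of Example 2.1 with a hypothetical $p=0.3$:

```python
# E(X) = sum of x * f_X(x) over the support D_X, stored as {x: probability}.
def expected_value(pmf):
    return sum(x * p for x, p in pmf.items())

p = 0.3  # hypothetical success probability
bernoulli = {1: p, 0: 1 - p}
print(expected_value(bernoulli))  # 0.3, i.e. E(X) = p
```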

Remarks:

  • If the number of elements of $D_X$ is finite, that is, $D_X=\{x_1,\dots,x_k\}$ where $k$ is a finite integer, then $\sum_{x\in D_X}|x|\times f_X(x)=\sum_{i=1}^{k}|x_i|\times f_X(x_i)$ and the condition $\sum_{i=1}^{k}|x_i|\times f_X(x_i)<+\infty$ is always satisfied.

  • Note that $\mu_X$ can take values that are not in $D_X$ (e.g., in Example 2.1, $E(X)=p\notin\{0,1\}$ for $0<p<1$).

Example: Let $X$ be the random variable with probability function $P(X=2^n)=2^{-n}$, for all $n\in\mathbb{N}$. This means that $P(X=x)=0$ for all $x\notin\{2^n:n\in\mathbb{N}\}$. One can notice that this probability function satisfies the conditions:

  • $P(X=x)\geq 0$ for all $x\in\mathbb{R}$;

  • $\sum_{x}P(X=x)=\sum_{n=1}^{\infty}2^{-n}=\frac{1/2}{1-1/2}=1.$

Additionally, one may easily notice that the distribution does not have an expected value: $E(X)=\sum_{n=1}^{\infty}2^n\times 2^{-n}=\sum_{n=1}^{\infty}1=+\infty.$
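The divergence can be seen numerically: each term $2^n\times 2^{-n}$ of the series contributes 1, so the partial sums grow without bound. A short sketch:

```python
# Partial sums of sum_{n=1}^{N} 2^n * 2^(-n): every term equals 1,
# so the series defining E(X) diverges linearly in N.
for N in (10, 100, 1000):
    partial = sum((2**n) * (2.0**-n) for n in range(1, N + 1))
    print(N, partial)  # 10 10.0, 100 100.0, 1000 1000.0
```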

2.2 Continuous Random Variables

Expected value (or mean, or expectation) of a continuous random variable: If $X$ is a continuous random variable and $f_X(x)$ is its probability density function at $x$, the expected value of $X$ is $\mu_X=E(X)=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx,$ provided that $\int_{-\infty}^{+\infty}|x|\,f_X(x)\,dx<\infty.$

Remark: The mean can be thought of as the centre of the distribution and, as such, it describes its location. Consequently, the mean is considered a measure of location.

Example 2.2 Let $X$ be a random variable with probability density function given by $f_X(x)=\begin{cases}\frac{1}{b-a}, & a<x<b\\ 0, & \text{otherwise.}\end{cases}$

Then, the expected value of $X$ is $E(X)=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx=\int_{a}^{b}\frac{x}{b-a}\,dx=\frac{b+a}{2}.$
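The integral is easy to check numerically, for instance with scipy.integrate.quad; a sketch for the hypothetical choice $a=2$, $b=5$, where $(a+b)/2=3.5$:

```python
# Numerical check of E(X) = (a + b)/2 for the uniform density on (a, b).
from scipy.integrate import quad

a, b = 2.0, 5.0  # hypothetical endpoints
density = lambda x: 1.0 / (b - a)
mean, _ = quad(lambda x: x * density(x), a, b)
print(mean, (a + b) / 2)  # both 3.5
```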

Expected value of a function of a discrete random variable: If $X$ is a discrete random variable and $f_X(x)$ is the value of its probability function at $x$, the expected value of $g(X)$ is $E[g(X)]=\sum_{x\in D_X}g(x)\times f_X(x)=\sum_{i=1}^{\infty}g(x_i)\times f_X(x_i),$ provided that $\sum_{x\in D_X}|g(x)|\times f_X(x)=\sum_{i=1}^{\infty}|g(x_i)|\times f_X(x_i)<+\infty.$

Expected value of a function of a continuous random variable: If $X$ is a continuous random variable and $f_X(x)$ is its probability density function at $x$, the expected value of $g(X)$ is $E[g(X)]=\int_{-\infty}^{+\infty}g(x)f_X(x)\,dx,$ provided that $\int_{-\infty}^{+\infty}|g(x)|f_X(x)\,dx<\infty.$

Remark:

  • The existence of E(X) does not imply the existence of E(g(X)) and vice versa.

3 The Mean of A Function of A Random Variable

  • $E(g(X))$ can be calculated using the above definition, or by finding the distribution of $Y=g(X)$ and computing $E(Y)$ directly.

Example: Let $X$ be a discrete random variable with probability function given by $f_X(x)=1/3$ for $x=-1,0,1$, and let $Y=g(X)=X^2$. We can compute $E(X^2)$ in the following two ways:

  • By using the definition of the expected value of a function of a random variable: $E(X^2)=(-1)^2 f_X(-1)+0^2 f_X(0)+1^2 f_X(1)=1\times 1/3+0\times 1/3+1\times 1/3=2/3.$

  • By finding the distribution of $Y$ and afterwards computing the expected value of $Y$: $f_Y(0)=P(Y=0)=P(X=0)=1/3$ and $f_Y(1)=P(Y=1)=P(X=-1)+P(X=1)=1/3+1/3=2/3.$

    Therefore, $f_Y(y)=\begin{cases}1/3, & y=0\\ 2/3, & y=1\\ 0, & \text{otherwise.}\end{cases}$ Thus, $E(Y)=0\times 1/3+1\times 2/3=2/3.$
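Both routes are easy to mirror in code; a minimal sketch for the probability function of this example:

```python
# E(g(X)) computed two ways for f_X(x) = 1/3 at x = -1, 0, 1 and g(x) = x^2.
pmf_x = {-1: 1/3, 0: 1/3, 1: 1/3}

# Way 1: weight g(x) by f_X(x) directly (the definition above).
e_g = sum(x**2 * p for x, p in pmf_x.items())

# Way 2: build the probability function of Y = X^2, then take E(Y).
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[x**2] = pmf_y.get(x**2, 0) + p
e_y = sum(y * p for y, p in pmf_y.items())

print(e_g, e_y)  # both 0.666... = 2/3
```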

Example: Let $X$ be a random variable such that $f_X(x)=\begin{cases}\frac{1}{b-a}, & a<x<b\\ 0, & \text{otherwise.}\end{cases}$ Then, the expected value of $Y=2X$ is

$E(2X)=\int_{-\infty}^{+\infty}2x\,f_X(x)\,dx=\int_{a}^{b}\frac{2x}{b-a}\,dx=\frac{2}{b-a}\int_{a}^{b}x\,dx=b+a.$

A different approach to calculate E(Y) is to derive the distribution of Y and afterwards to compute E(Y).

Indeed, $F_Y(y)=P(Y\leq y)=P(2X\leq y)=P(X\leq y/2)=F_X(y/2).$

The density function can be obtained by differentiating the cdf $F_Y$: $f_Y(y)=F_Y'(y)=\left(F_X(y/2)\right)'=\frac{1}{2}F_X'(y/2)=\frac{1}{2}f_X(y/2)=\begin{cases}\frac{1}{2(b-a)}, & 2a<y<2b\\ 0, & \text{otherwise.}\end{cases}$

Therefore,

$E(Y)=\int_{-\infty}^{+\infty}y\,f_Y(y)\,dy=\int_{2a}^{2b}\frac{y}{2(b-a)}\,dy=b+a.$

4 Properties of The Expected Value

The expected values satisfy the following properties:

  • $E(a+bX)=a+bE(X)$, where $a$ and $b$ are constants.

  • $E(X-\mu_X)=E(X)-\mu_X=0.$

  • If $a$ is a constant, $E(a)=a.$

  • If $b$ is a constant, $E(b\times g(X))=bE(g(X)).$

  • Given $n$ functions $u_i(X)$, $i=1,\dots,n$, and constants $c_1,\dots,c_n$, $E\left[\sum_{i=1}^{n}c_i u_i(X)\right]=\sum_{i=1}^{n}c_i E[u_i(X)].$ A numerical check of the first property is sketched below.
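These properties can be verified on any finite support; a sketch checking $E(a+bX)=a+bE(X)$ on a hypothetical pmf:

```python
# Check E(a + b*X) = a + b*E(X) on a small discrete distribution.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}  # hypothetical probability function
a, b = 4.0, -2.0                # hypothetical constants

e_x = sum(x * p for x, p in pmf.items())                # E(X) = 1.1
e_affine = sum((a + b * x) * p for x, p in pmf.items())
print(e_affine, a + b * e_x)  # both 1.8
```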

5 Higher Order Raw Moments and Centered Moments

Moments of a discrete random variable: The $r$th (raw) moment of a discrete random variable (or of its distribution), denoted by $\mu_r'$, is the expected value of $X^r$: $\mu_r'=E(X^r)=\sum_{x\in D_X}x^r\times f_X(x)=\sum_{i=1}^{\infty}x_i^r\times f_X(x_i),$ for $r=1,2,\dots$, provided that $\sum_{x\in D_X}|x|^r\times f_X(x)=\sum_{i=1}^{\infty}|x_i|^r\times f_X(x_i)<+\infty.$

Moments of a continuous random variable: The $r$th (raw) moment of a continuous random variable (or of its distribution), denoted by $\mu_r'$, is the expected value of $X^r$: $\mu_r'=E(X^r)=\int_{-\infty}^{+\infty}x^r f_X(x)\,dx,$ provided that $\int_{-\infty}^{+\infty}|x|^r f_X(x)\,dx<\infty.$

The $r$th central moment of the random variable, or $r$th moment of a random variable about its mean:

Central moments of a discrete random variable: The $r$th central moment of a discrete random variable (or of its distribution), denoted by $\mu_r$, is the expected value of $(X-\mu_X)^r$: $\mu_r=E[(X-\mu_X)^r]=\sum_{x\in D_X}(x-\mu_X)^r\times f_X(x)=\sum_{i=1}^{\infty}(x_i-\mu_X)^r\times f_X(x_i),$ for $r=1,2,\dots$, provided that $\sum_{x\in D_X}|x-\mu_X|^r\times f_X(x)=\sum_{i=1}^{\infty}|x_i-\mu_X|^r\times f_X(x_i)<+\infty.$

Central moments of a continuous random variable: The $r$th central moment of a continuous random variable (or of its distribution), denoted by $\mu_r$, is the expected value of $(X-\mu_X)^r$: $\mu_r=E[(X-\mu_X)^r]=\int_{-\infty}^{+\infty}(x-\mu_X)^r f_X(x)\,dx,$ provided that $\int_{-\infty}^{+\infty}|x-\mu_X|^r f_X(x)\,dx<\infty.$

Example 5.1 Let $X$ be a random variable such that $f_X(x)=\frac{1}{b-a}$ if $a<x<b$, and $f_X(x)=0$ if $x\notin(a,b)$.

The 2nd raw moment is $\int_{-\infty}^{+\infty}x^2 f_X(x)\,dx=\int_{a}^{b}\frac{x^2}{b-a}\,dx=\frac{1}{3}(a^2+ab+b^2).$

The 2nd central moment is $\int_{-\infty}^{+\infty}(x-\mu_X)^2 f_X(x)\,dx=\int_{a}^{b}\frac{(x-\mu_X)^2}{b-a}\,dx=\frac{(b-a)^2}{12}.$
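Both integrals of Example 5.1 can be reproduced symbolically; a sketch with sympy:

```python
# Symbolic 2nd raw and 2nd central moments of the uniform density on (a, b).
import sympy as sp

x, a, b = sp.symbols('x a b', real=True)
f = 1 / (b - a)
mu = sp.integrate(x * f, (x, a, b))            # mean, simplifies to (a+b)/2
m2 = sp.integrate(x**2 * f, (x, a, b))         # 2nd raw moment
c2 = sp.integrate((x - mu)**2 * f, (x, a, b))  # 2nd central moment

print(sp.factor(sp.simplify(m2)))  # (a**2 + a*b + b**2)/3
print(sp.factor(sp.simplify(c2)))  # (a - b)**2/12, i.e. (b - a)^2/12
```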

  1. $\mu_1$ is of no interest because it is zero when it exists.

  2. $\mu_2$ is an important measure and is called the variance.

  3. $\mu_3$ and $\mu_4$ are also important.

5.1 The Variance of a Random Variable

Variance: The second central moment of a random variable ($\mu_2$), also called the variance, is an indicator of the dispersion of the values of $X$ about the mean.

The variance of a discrete random variable: $\mathrm{Var}(X)=\sigma_X^2=\mu_2=E[(X-\mu_X)^2]=\sum_{x\in D_X}(x-\mu_X)^2\times f_X(x),$ provided that $\mathrm{Var}(X)<+\infty.$

The variance of a continuous random variable: $\mathrm{Var}(X)=\sigma_X^2=\mu_2=E[(X-\mu_X)^2]=\int_{-\infty}^{+\infty}(x-\mu_X)^2 f_X(x)\,dx,$ provided that $\mathrm{Var}(X)<+\infty.$

Remark: We can show that if $\mu_2'=E(X^2)$ exists, then both $\mu_X$ and $\sigma_X^2$ exist.

Properties of the Variance:

  • $\mathrm{Var}(X)\geq 0.$

  • $\sigma_X^2=\mathrm{Var}(X)=E(X^2)-\mu_X^2.$

  • If $c$ is a constant, $\mathrm{Var}(c)=0.$

  • If $a$ and $b$ are constants, $\mathrm{Var}(a+bX)=b^2\,\mathrm{Var}(X).$

Example 5.2 Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$ Then $E(X)=p\times 1+(1-p)\times 0=p$, $E(X^2)=p\times 1^2+(1-p)\times 0^2=p$, and $\mathrm{Var}(X)=E(X^2)-(E(X))^2=p(1-p).$

Example 5.3 Let $X$ be a continuous random variable such that $f_X(x)=\begin{cases}2x, & 0<x<1\\ 0, & \text{otherwise.}\end{cases}$ Then, $E(X)=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx=2/3$ and $E(X^2)=\int_{-\infty}^{+\infty}x^2 f_X(x)\,dx=1/2.$ Therefore, $\mathrm{Var}(X)=E(X^2)-(E(X))^2=\frac{1}{18}.$
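A numerical check of Example 5.3 using scipy.integrate.quad:

```python
# Var(X) = E(X^2) - (E(X))^2 for the density f_X(x) = 2x on (0, 1).
from scipy.integrate import quad

f = lambda x: 2 * x
e_x, _ = quad(lambda x: x * f(x), 0, 1)      # 2/3
e_x2, _ = quad(lambda x: x**2 * f(x), 0, 1)  # 1/2
print(e_x2 - e_x**2)  # 0.0555... = 1/18
```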

Standard deviation: The variance is not measured on the scale of the random variable, since it is computed using the square function. To obtain a measure of dispersion about the mean that is measured on the same scale as the random variable, we compute the standard deviation, given by $\sigma_X=\sqrt{\mathrm{Var}(X)}.$

5.2 Coefficient of Variation, Skewness and Kurtosis

Other Population Distribution Summary Statistics are:

Coefficient of variation: If we are interested in a measure of dispersion which is independent of the scale of the random variable, we should use the coefficient of variation, given by $CV(X)=\frac{\sigma_X}{\mu_X}.$

Example 5.4 Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$ As computed before, $E(X)=p$, $E(X^2)=p$ and $\mathrm{Var}(X)=E(X^2)-(E(X))^2=p(1-p).$ Additionally, $\sigma_X=\sqrt{p(1-p)}$ and $CV(X)=\frac{\sqrt{p(1-p)}}{p}=\sqrt{\frac{1-p}{p}}.$

Example 5.5 Let $X$ be a continuous random variable such that $f_X(x)=\begin{cases}2x, & 0<x<1\\ 0, & \text{otherwise.}\end{cases}$ We have already computed $E(X)=2/3$ and $\mathrm{Var}(X)=\frac{1}{18}$. Therefore, $\sigma_X=\sqrt{\frac{1}{18}}=\frac{1}{3\sqrt{2}}$ and $CV(X)=\frac{\sigma_X}{\mu_X}=\frac{1}{2\sqrt{2}}.$
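The same numbers follow directly from $E(X)=2/3$ and $\mathrm{Var}(X)=1/18$; a short sketch:

```python
# Standard deviation and coefficient of variation for Example 5.5.
import math

mean, var = 2 / 3, 1 / 18
sigma = math.sqrt(var)  # 0.2357... = 1/(3*sqrt(2))
cv = sigma / mean       # 0.3535... = 1/(2*sqrt(2))
print(sigma, cv)
```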

Skewness:

Beyond location and dispersion, it is desirable to know the behaviour of the distribution about the mean. One parameter of interest is the coefficient of asymmetry, also known as skewness. This parameter is a measure of the asymmetry of a probability function/density about the mean of the random variable. It is given by $\gamma_1=\frac{E[(X-\mu_X)^3]}{\mathrm{Var}(X)^{3/2}}=\frac{\mu_3}{\sigma_X^3}.$

Remarks:

  • For discrete random variables, the probability function is symmetric about the mean if $f_X(\mu_X-\delta)=f_X(\mu_X+\delta)$ for all $\delta\in\mathbb{R}$;

  • For continuous random variables, the probability density function is symmetric about the mean if $f_X(\mu_X-\delta)=f_X(\mu_X+\delta)$ for all $\delta\in\mathbb{R}$.

Example 5.6 Let $X$ be a discrete random variable with probability function given by $f_X(x)=\begin{cases}0.25, & \text{for } x=-1\\ 0.5, & \text{for } x=0\\ 0.25, & \text{for } x=1.\end{cases}$

Note that $\mu_X=E(X)=(-1)\times 0.25+0\times 0.5+1\times 0.25=0$ and $E(X^3)=(-1)^3\times 0.25+0^3\times 0.5+1^3\times 0.25=0.$ Therefore, $f_X(x)$ is symmetric about $\mu_X=0$ and $\gamma_1=0.$

Remark: Note, however, that we can have $\gamma_1=0$ while the probability function/density is not symmetric about the mean; that is, $\gamma_1=0$ does not imply symmetry.

Example 5.7 Let $X$ be a discrete random variable with probability function given by $f_X(x)=\begin{cases}0.1, & \text{for } x=-3\\ 0.5, & \text{for } x=-1\\ 0.4, & \text{for } x=2.\end{cases}$

Note that $\mu_X=E(X)=(-3)\times 0.1+(-1)\times 0.5+2\times 0.4=0.$ Since $f_X(-1)\neq f_X(1)$, this function is not symmetric around $\mu_X=0$. However, $E(X^3)=(-3)^3\times 0.1+(-1)^3\times 0.5+2^3\times 0.4=0$, and consequently $\gamma_1=0.$
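A sketch reproducing Example 5.7, confirming that $\gamma_1=0$ despite the asymmetry:

```python
# Skewness gamma_1 = E[(X - mu)^3] / sigma^3 for the pmf of Example 5.7.
pmf = {-3: 0.1, -1: 0.5, 2: 0.4}

mu = sum(x * p for x, p in pmf.items())             # 0 (up to rounding)
var = sum((x - mu)**2 * p for x, p in pmf.items())
mu3 = sum((x - mu)**3 * p for x, p in pmf.items())  # 0 (up to rounding)
print(mu, mu3, mu3 / var**1.5)  # all approximately 0, yet the pmf is asymmetric
```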

Kurtosis:

The kurtosis measures the “thickness” of the “tails” of the probability function/density or, equivalently, the “flattening” of the probability function/density in the central zone of the distribution:

$\gamma_2=\frac{E[(X-\mu_X)^4]}{\mathrm{Var}(X)^2}=\frac{\mu_4}{\sigma_X^4}.$

The kurtosis of the normal distribution is 3. We use this distribution as a reference; therefore, we can define the excess kurtosis as $\gamma_2-3=\frac{\mu_4}{\sigma_X^4}-3.$
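The value 3 for the normal distribution can be verified symbolically, here for the standard normal (so $\mu_X=0$ and $\sigma_X=1$):

```python
# Kurtosis of the standard normal: mu_4 / sigma^4 = E[X^4] when mu=0, sigma=1.
import sympy as sp

x = sp.symbols('x', real=True)
phi = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi)  # standard normal density
mu4 = sp.integrate(x**4 * phi, (x, -sp.oo, sp.oo))
print(mu4)  # 3, so the excess kurtosis is 3 - 3 = 0
```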

6 Quantiles and Mode of A Distribution

Quantiles: Other parameters of interest are the quantiles of a (cumulative) distribution or quantiles of a random variable. Quantiles have the advantage that they exist even for random variables that do not have moments.

Definition: Let $X$ be a random variable and $\alpha\in(0,1)$. The quantile of order $\alpha$, $q_\alpha$, is the smallest value among all points $x\in\mathbb{R}$ that satisfy the condition $F_X(x)\geq\alpha.$

Remarks:

  • If $X$ is a discrete random variable, $q_\alpha\in D_X$.

  • The quantile of order 0.5 is called the median of the distribution. It can also be interpreted as a centre of the distribution and therefore it is also considered a measure of location.

  • When the probability function/density is symmetric, the median equals the mean.

Example: Let $X$ be a random variable such that $F_X(x)=\begin{cases}0, & x<0\\ 1-e^{-x}, & x\geq 0.\end{cases}$ What is the quantile of order 0.4?

Solution: We can start by solving $F_X(x)=0.4 \Leftrightarrow 1-e^{-x}=0.4 \Leftrightarrow x=-\ln(0.6)\approx 0.5108.$
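A one-line numerical check of this solution:

```python
# Quantile of order 0.4 for F_X(x) = 1 - exp(-x): solve 1 - exp(-x) = 0.4.
import math
print(-math.log(0.6))  # 0.5108...
```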

Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$ It follows that $F_X(x)=\begin{cases}0, & \text{for } x<0\\ 1-p, & \text{for } 0\leq x<1\\ 1, & \text{for } x\geq 1.\end{cases}$

  • Compute the quantile of order 0.5 when p=0.2.

  • Compute the quantile of order 0.5 when p=0.6.

Fix p=0.2:

$q_{0.5}$ is the smallest value among all points $x\in\mathbb{R}$ that satisfy the condition $F_X(x)\geq 0.5$. With $p=0.2$, $F_X(x)=0<0.5$ for $x<0$ and $F_X(0)=1-p=0.8\geq 0.5$.

Therefore, $q_{0.5}=0.$

Fix p=0.6:

$q_{0.5}$ is the smallest value among all points $x\in\mathbb{R}$ that satisfy the condition $F_X(x)\geq 0.5$. With $p=0.6$, $F_X(x)=1-p=0.4<0.5$ for $0\leq x<1$ and $F_X(1)=1\geq 0.5$.

Therefore, $q_{0.5}=1.$
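The definition (smallest $x$ with $F_X(x)\geq\alpha$) translates directly into code; a sketch covering both computations above:

```python
# q_alpha = smallest x in D_X with F_X(x) >= alpha, for a discrete rv.
def quantile(pmf, alpha):
    cum = 0.0
    for x in sorted(pmf):  # scan the support in increasing order
        cum += pmf[x]      # cum = F_X(x)
        if cum >= alpha:
            return x

print(quantile({0: 0.8, 1: 0.2}, 0.5))  # 0  (p = 0.2)
print(quantile({0: 0.4, 1: 0.6}, 0.5))  # 1  (p = 0.6)
```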

Remarks:

  • The $q_\alpha$ are called quartiles if $\alpha=0.25,\ 0.5,\ 0.75$. Therefore the first quartile is $q_{0.25}$, the second quartile is $q_{0.5}$, and the third quartile is $q_{0.75}$.

  • The $q_\alpha$ are called deciles if $\alpha=0.1,\ 0.2,\dots,\ 0.9$. Therefore the first decile is $q_{0.1}$, the second decile is $q_{0.2}$, etc.

  • The $q_\alpha$ are called percentiles if $\alpha=0.01,\ 0.02,\dots,\ 0.99$. Therefore the first percentile is $q_{0.01}$, the second percentile is $q_{0.02}$, etc.

  • The interquartile range $IQR=q_{0.75}-q_{0.25}$ is considered a measure of dispersion.

The mode: The mode of a random variable $X$ (or of its distribution) is the value $mo(X)$ that satisfies the condition $f_X(mo(X))\geq f_X(x)$ for all $x\in\mathbb{R}$, where $f_X(x)$ is the probability function in the case of discrete random variables and the probability density function in the case of continuous random variables.

Remarks:

  1. The mode can also be interpreted as a centre of the distribution and therefore it is also considered a measure of location.

  2. In the case of a discrete random variable, the mode is the most probable value.

  3. The mode does not have to be unique.

  4. If the probability function/density is symmetric and has only one mode, then the mode equals the median and the mean.

Example 6.1 Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$

Fix p=0.2:

It follows that $mo(X)=\arg\max_{x\in\mathbb{R}}P(X=x)=0.$

Fix p=0.6:

It follows that $mo(X)=\arg\max_{x\in\mathbb{R}}P(X=x)=1.$

Fix p=0.5:

It follows that $\arg\max_{x\in\mathbb{R}}P(X=x)=\{0,1\}$.

In this case, there are two modes: $mo(X)=0$ and $mo(X)=1$.

Example 6.2 Let $X$ be a continuous random variable with density function $f_X(x)=\begin{cases}x, & 0<x\leq 1\\ 2-x, & 1<x<2\\ 0, & \text{otherwise.}\end{cases}$ Compute the mode.

Solution: Since $f_X(x)<1$ for $x\in(0,2)\setminus\{1\}$ and $f_X(1)=1$, the mode is given by $mo(X)=1$, because $x=1$ maximizes the function $f_X$.
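The mode can also be located numerically by maximising the density, e.g. by minimising $-f_X$ with scipy.optimize.minimize_scalar:

```python
# Mode of the density f(x) = x on (0, 1], 2 - x on (1, 2), 0 otherwise.
from scipy.optimize import minimize_scalar

def f(x):
    if 0 < x <= 1:
        return x
    if 1 < x < 2:
        return 2 - x
    return 0.0

res = minimize_scalar(lambda x: -f(x), bounds=(0, 2), method='bounded')
print(res.x)  # approximately 1, the mode
```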

Example 6.3 Let $X$ be a continuous random variable with density function $f_X(x)=\begin{cases}2x, & 0<x<1\\ 0, & \text{otherwise.}\end{cases}$ There is no mode because the density function does not attain a maximum: $f_X(x)$ increases towards 2 as $x\to 1^-$, but $f_X(1)=0$.

7 Moment Generating Functions

The moment generating function: The moment generating function is an important function in probability because, when it is properly defined, it uniquely determines the distribution function.

Definition: The moment generating function of a discrete random variable is given by $M_X(t)=E(e^{tX})=\sum_{x\in D_X}e^{tx}\times f_X(x)=\sum_{i=1}^{\infty}e^{tx_i}\times f_X(x_i),$ provided that it is finite.

Definition: The moment generating function of a continuous random variable is given by $M_X(t)=E(e^{tX})=\int_{-\infty}^{+\infty}e^{tx}f_X(x)\,dx,$ provided that it is finite.

Remarks on the moment generating function (m.g.f.):

  • The m.g.f. may not exist.

  • If $X$ is a discrete random variable and $D_X$ is finite, then the m.g.f. always exists;

  • The moment generating function is a function of $t$, not of $X$;

  • If the m.g.f. exists, then moments of every order exist. The reverse is not true.

  • A distribution which has no moments, or which has only the first $k$ moments, does not have a m.g.f.

  • The moment generating function is used to calculate the moments.

  • The m.g.f. uniquely determines the distribution function. That is, if two random variables have the same m.g.f., then the cumulative distribution functions of the random variables coincide, except perhaps at a finite number of points.

Theorem: Let $X$ be a random variable with moment generating function $M_X$ defined. Then, $\left.\frac{d^r M_X(t)}{dt^r}\right|_{t=0}=\mu_r'=E[X^r],\ r=1,2,3,\dots$

Remark: This result allows us to compute the raw moment of $X$ of order $k$ by computing the $k$th derivative of $M_X$ and evaluating it at the point $t=0$.

Example: Let $X$ be a random variable such that $f_X(x)=\begin{cases}0.2, & x=0,3\\ 0.5, & x=1\\ 0.1, & x=2\\ 0, & \text{otherwise.}\end{cases}$ By definition, $E(X)=\sum_{x=0}^{3}x\,f_X(x)=0\times 0.2+1\times 0.5+2\times 0.1+3\times 0.2=1.3.$

By using the moment generating function, we have $M_X(t)=\sum_{x=0}^{3}e^{tx}f_X(x)=0.2(1+e^{3t})+0.5e^{t}+0.1e^{2t}.$ Therefore, $M_X'(t)=0.6e^{3t}+0.5e^{t}+0.2e^{2t}$, and, consequently, $E(X)=M_X'(0)=1.3.$
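The differentiation step can be reproduced symbolically; a sketch with sympy:

```python
# E(X) = M_X'(0) for M_X(t) = 0.2*(1 + e^{3t}) + 0.5*e^t + 0.1*e^{2t}.
import sympy as sp

t = sp.symbols('t')
M = 0.2 * (1 + sp.exp(3 * t)) + 0.5 * sp.exp(t) + 0.1 * sp.exp(2 * t)
print(sp.diff(M, t).subs(t, 0))  # 1.3
```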

Example 7.1 Let $X$ be a discrete random variable such that $X=\begin{cases}1, & \text{if there is a success}\\ 0, & \text{otherwise}\end{cases}$ and $P(X=x)=\begin{cases}p, & \text{if } x=1\\ 1-p, & \text{if } x=0.\end{cases}$ The moment generating function $M_X(t)$ is given by $M_X(t)=E(e^{tX})=(1-p)e^{0\times t}+pe^{t\times 1}=(1-p)+pe^{t}.$ The derivative of $M_X(t)$ with respect to $t$ is $\frac{\partial}{\partial t}M_X(t)=pe^{t}.$ Therefore, $E(X)=\left.\frac{\partial}{\partial t}M_X(t)\right|_{t=0}=p.$

Example: Let $X$ be a continuous random variable with density function $f_X(x)=\begin{cases}0, & \text{for } x<0\\ \lambda e^{-\lambda x}, & \text{for } x\geq 0.\end{cases}$

$M_X(t)=E(e^{tX})=\int_{0}^{+\infty}e^{tx}\lambda e^{-\lambda x}\,dx=\lambda\lim_{z\to\infty}\int_{0}^{z}e^{(t-\lambda)x}\,dx=\lambda\lim_{z\to\infty}\left[\frac{e^{(t-\lambda)x}}{t-\lambda}\right]_{x=0}^{x=z}=\frac{\lambda}{\lambda-t},$

provided that $t<\lambda$. Now $\frac{dM_X(t)}{dt}=\frac{d}{dt}\left(\frac{\lambda}{\lambda-t}\right)=\frac{\lambda}{(\lambda-t)^2}$ and, consequently, $E(X)=\left.\frac{dM_X(t)}{dt}\right|_{t=0}=\frac{1}{\lambda}.$

Proposition: Let $X$ be a random variable such that $M_X$ is its moment generating function. For $a,b\in\mathbb{R}$, $M_{bX+a}(t)=E[e^{(bX+a)t}]=e^{at}M_X(bt).$

Proposition: Let $X_i$, with $i=1,\dots,n$, be independent random variables whose moment generating functions $M_{X_i}$ exist. The moment generating function of the sum of independent random variables $S_n=\sum_{i=1}^{n}X_i$ equals the product of their m.g.f.s: $M_{S_n}(t)=M_{X_1}(t)\times M_{X_2}(t)\times\dots\times M_{X_n}(t).$
