User:Boris Tsirelson/Sandbox1

From Citizendium
Revision as of 13:59, 29 June 2009 by Boris Tsirelson

WP Conditioning (probability) 16:17, 18 November 2008

Beliefs depend on the available information. This idea is formalized in probability theory by conditioning. Conditional probabilities, conditional expectations and conditional distributions are treated on three levels: discrete probabilities, probability density functions, and measure theory. Conditioning leads to a non-random result if the condition is completely specified; otherwise, if the condition is left random, the result of conditioning is also random.

This article concentrates on interrelations between various kinds of conditioning, shown mostly by examples. For systematic treatment (and corresponding literature) see more specialized articles mentioned below.

Conditioning on the discrete level

Example. A fair coin is tossed 10 times; the random variable $X$ is the number of heads in these 10 tosses, and $Y$ is the number of heads in the first 3 tosses. Although $Y$ emerges before $X$, it may happen that someone knows $X$ but not $Y$.

Conditional probability

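Given that $X = 1$, the conditional probability of the event $Y = 0$ is $P(Y=0|X=1) = P(Y=0,X=1) / P(X=1) = 0.7.$ More generally,

$$\mathbb{P} (Y=0|X=x) = \frac{ \binom{7}{x} }{ \binom{10}{x} } = \frac{ 7! \, (10-x)! }{ (7-x)! \, 10! }$$

for $x = 0, 1, \dots, 7;$ otherwise (for $x = 8, 9, 10$), $P(Y=0|X=x) = 0.$ One may also treat the conditional probability as a random variable, a function of the random variable $X$, namely,

$$\mathbb{P} (Y=0|X) = \begin{cases} \binom{7}{X} / \binom{10}{X} &\text{for } X \le 7,\\ 0 &\text{for } X > 7. \end{cases}$$

The expectation of this random variable is equal to the (unconditional) probability,

$$\mathbb{E} ( \mathbb{P} (Y=0|X) ) = \sum_x \mathbb{P} (Y=0|X=x) \mathbb{P} (X=x) = \mathbb{P} (Y=0),$$

namely,

$$\sum_{x=0}^{7} \frac{ \binom{7}{x} }{ \binom{10}{x} } \cdot \frac1{2^{10}} \binom{10}{x} = \frac18 ,$$

which is an instance of the law of total probability $E(P(A|X))=P(A).$

Thus, $P(Y=0|X=1)$ may be treated as the value of the random variable $P(Y=0|X)$ corresponding to $X=1.$ On the other hand, $P(Y=0|X=1)$ is well-defined irrespective of other possible values of $X$.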
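The discrete computation can be checked by brute-force enumeration. The following is a minimal sketch, assuming that $X$ counts heads in all 10 tosses, $Y$ counts heads in the first 3, and the event of interest is $Y=0$; the helper name `cond_prob_y0` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 2**10 equally likely outcomes of 10 fair coin tosses (1 = heads).
outcomes = list(product((0, 1), repeat=10))

def cond_prob_y0(x):
    """P(Y = 0 | X = x), with X = heads in 10 tosses, Y = heads in the first 3."""
    given = [w for w in outcomes if sum(w) == x]
    favourable = [w for w in given if sum(w[:3]) == 0]
    return Fraction(len(favourable), len(given))

# P(X = x) for x = 0..10, obtained by counting outcomes.
p_x = {x: Fraction(sum(1 for w in outcomes if sum(w) == x), len(outcomes))
       for x in range(11)}

# Law of total probability: sum over x of P(Y=0 | X=x) * P(X=x).
total = sum(cond_prob_y0(x) * p_x[x] for x in range(11))

print(cond_prob_y0(1))   # 7/10
print(total)             # 1/8
```

Exact fractions are used so that the identity holds without rounding error.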

Conditional expectation

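Given that $X = 1$, the conditional expectation of the random variable $Y$ is $E(Y|X=1) = 0.3.$ More generally,

$$\mathbb{E} (Y|X=x) = \frac3{10} x$$

for $x = 0, \dots, 10.$ (In this example it appears to be a linear function, but in general it is nonlinear.) One may also treat the conditional expectation as a random variable, a function of the random variable $X$, namely,

$$\mathbb{E} (Y|X) = \frac3{10} X.$$

The expectation of this random variable is equal to the (unconditional) expectation of $Y$,

$$\mathbb{E} ( \mathbb{E} (Y|X) ) = \sum_x \mathbb{E} (Y|X=x) \mathbb{P} (X=x) = \mathbb{E} (Y),$$

namely,

$$\sum_{x=0}^{10} \frac3{10} x \cdot \frac1{2^{10}} \binom{10}{x} = \frac32 \quad \text{or simply} \quad \mathbb{E} \Big( \frac3{10} X \Big) = \frac3{10} \mathbb{E} (X) = \frac3{10} \cdot 5 = \frac32 ,$$

which is an instance of the law of total expectation $E(E(Y|X))=E(Y).$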
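The law of total expectation for the coin example can be verified exactly. This is a sketch under the assumption that $X$ is the number of heads in 10 tosses and $Y$ the number of heads in the first 3; `cond_exp_y`, `tower` and `direct` are illustrative names.

```python
from fractions import Fraction
from itertools import product
from math import comb

# All 2**10 equally likely outcomes of 10 fair coin tosses (1 = heads).
outcomes = list(product((0, 1), repeat=10))

def cond_exp_y(x):
    """E(Y | X = x): average of Y over the outcomes with exactly x heads."""
    given = [w for w in outcomes if sum(w) == x]
    return Fraction(sum(sum(w[:3]) for w in given), len(given))

# Law of total expectation: weight E(Y | X = x) by P(X = x) = C(10, x) / 2**10.
tower = sum(cond_exp_y(x) * Fraction(comb(10, x), 2**10) for x in range(11))

# Unconditional expectation of Y, computed directly.
direct = Fraction(sum(sum(w[:3]) for w in outcomes), len(outcomes))

print([str(cond_exp_y(x)) for x in (0, 1, 2, 10)])   # ['0', '3/10', '3/5', '3']
print(tower, direct)                                  # 3/2 3/2
```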

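The random variable $E(Y|X)$ is the best predictor of $Y$ given $X$. That is, it minimizes the mean square error $E(Y-f(X))^2$ on the class of all random variables of the form $f(X).$ This class of random variables remains intact if $X$ is replaced, say, with $2X.$ Thus, $E(Y|2X)=E(Y|X).$ It does not mean that $E(Y|2X) = \frac3{10} \cdot 2X;$ rather, $E(Y|2X) = \frac3{20} \cdot 2X = \frac3{10} X.$ In particular, $E(Y|2X=2)=0.3.$ More generally, $E(Y|g(X))=E(Y|X)$ for every function $g$ that is one-to-one on the set of all possible values of $X$. The values of $X$ are irrelevant; what matters is the partition (denote it $\alpha_X$)

$$\Omega = \{ X=x_1 \} \uplus \{ X=x_2 \} \uplus \dots$$

of the sample space $\Omega$ into disjoint sets $\{X=x_n\}.$ (Here $x_1, x_2, \dots$ are all possible values of $X$.) Given an arbitrary partition $\alpha$ of $\Omega$, one may define the random variable $E(Y|\alpha).$ Still, $E(E(Y|\alpha))=E(Y).$

Conditional probability may be treated as a special case of conditional expectation. Namely, $P(A|X)=E(Y|X)$ if $Y$ is the indicator of $A$. Therefore the conditional probability also depends on the partition $\alpha_X$ generated by $X$ rather than on $X$ itself; $P(A|g(X)) = P(A|X) = P(A|\alpha),$ where $\alpha = \alpha_X = \alpha_{g(X)}.$

On the other hand, conditioning on an event $B$ is well-defined, provided that $P(B) \neq 0,$ irrespective of any partition that may contain $B$ as one of several parts.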
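The dependence on the partition rather than on the labels can be illustrated by computing conditional expectations block by block. The sketch below assumes the coin example ($X$ = heads in 10 tosses, $Y$ = heads in the first 3); the function `cond_exp` and the parity partition are illustrative choices, not part of the original text.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# All 2**10 equally likely outcomes of 10 fair coin tosses (1 = heads).
outcomes = list(product((0, 1), repeat=10))

def X(w):
    return sum(w)        # heads in all 10 tosses

def Y(w):
    return sum(w[:3])    # heads in the first 3 tosses

def cond_exp(label, target):
    """E(target | partition): maps each block to (block average, block probability).

    `label` names the block containing an outcome; only the blocks matter.
    """
    sums, counts = defaultdict(int), defaultdict(int)
    for w in outcomes:
        sums[label(w)] += target(w)
        counts[label(w)] += 1
    return {b: (Fraction(sums[b], counts[b]), Fraction(counts[b], len(outcomes)))
            for b in sums}

by_x = cond_exp(X, Y)                          # partition generated by X
by_2x = cond_exp(lambda w: 2 * X(w), Y)        # same blocks, relabelled by 2X
by_parity = cond_exp(lambda w: X(w) % 2, Y)    # a coarser partition

print(by_x[1][0], by_2x[2][0])                         # 3/10 3/10
print(sum(avg * p for avg, p in by_parity.values()))   # 3/2
```

Relabelling the blocks by $2X$ changes the keys of the dictionary but not the averages attached to each outcome, and averaging over any partition recovers the unconditional expectation.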

Conditional distribution

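Given $X=x,$ the conditional distribution of $Y$ is

$$\mathbb{P} (Y=y|X=x) = \frac{ \binom{3}{y} \binom{7}{x-y} }{ \binom{10}{x} }$$

for $\max(0,x-7) \le y \le \min(3,x).$ It is the hypergeometric distribution $H(x;3,7),$ or equivalently, $H(3;x,10-x).$ The corresponding expectation $0.3x,$ obtained from the general formula $n \frac{R}{R+W}$ for $H(n;R,W),$ is nothing but the conditional expectation $E(Y|X=x) = 0.3x.$

Treating $H(X;3,7)$ as a random distribution (a random vector in the four-dimensional space of all measures on $\{0,1,2,3\}$), one may take its expectation, getting the unconditional distribution of $Y$, namely the binomial distribution $\mathrm{Bin}(3,0.5).$ This fact amounts to the equality

$$\sum_{x=0}^{10} \mathbb{P} (Y=y|X=x) \mathbb{P} (X=x) = \mathbb{P} (Y=y) = \frac1{2^3} \binom{3}{y}$$

for $y = 0, 1, 2, 3;$ just the law of total probability.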
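The mixture of conditional distributions can be computed explicitly. The sketch below assumes the hypergeometric form $\binom{3}{y}\binom{7}{x-y}/\binom{10}{x}$ for the conditional distribution of $Y$ (heads in the first 3 tosses) given $X=x$ (heads in all 10); `cond_dist_y` and `mixture` are illustrative names.

```python
from fractions import Fraction
from math import comb

def cond_dist_y(x):
    """P(Y = y | X = x) for y = 0, 1, 2, 3, as a list of four exact fractions."""
    return [Fraction(comb(3, y) * comb(7, x - y), comb(10, x)) if x >= y
            else Fraction(0)            # more heads in 3 tosses than in all 10
            for y in range(4)]

# Expectation of the random distribution: mix over X ~ Bin(10, 1/2).
mixture = [sum(cond_dist_y(x)[y] * Fraction(comb(10, x), 2**10) for x in range(11))
           for y in range(4)]

print([str(p) for p in mixture])   # ['1/8', '3/8', '3/8', '1/8']
```

The result is the list of binomial probabilities for three fair tosses.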

Conditioning on the level of densities

Example. A point of the sphere $x^2+y^2+z^2=1$ is chosen at random according to the uniform distribution on the sphere. The random variables $X$, $Y$, $Z$ are the coordinates of the random point. The joint density of $X$, $Y$, $Z$ does not exist (since the sphere is of zero volume), but the joint density $f_{X,Y}$ of $X$, $Y$ exists,

$$f_{X,Y} (x,y) = \begin{cases} \frac1{2\pi\sqrt{1-x^2-y^2}} &\text{if } x^2+y^2 < 1,\\ 0 &\text{otherwise}. \end{cases}$$

(The density is non-constant because of a non-constant angle between the sphere and the plane.) The density of $X$ may be calculated by integration,

$$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x,y) \, \mathrm{d}y = \int_{-\sqrt{1-x^2}}^{+\sqrt{1-x^2}} \frac{ \mathrm{d}y }{ 2\pi\sqrt{1-x^2-y^2} } \, ;$$

surprisingly, the result does not depend on $x$ in $(-1,1),$

$$f_X(x) = \begin{cases} 0.5 &\text{for } -1 < x < 1,\\ 0 &\text{otherwise}, \end{cases}$$

which means that $X$ is distributed uniformly on $(-1,1).$ The same holds for $Y$ and $Z$ (and in fact, for $aX+bY+cZ$ whenever $a^2+b^2+c^2=1$).
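One way to see why the integral for $f_X$ does not depend on $x$ is the substitution $y = \sqrt{1-x^2}\,\sin t$; the following worked step is a sketch of that calculation (the variable $t$ is introduced here for illustration).

```latex
\begin{align}
f_X(x) &= \int_{-\sqrt{1-x^2}}^{+\sqrt{1-x^2}}
          \frac{\mathrm{d}y}{2\pi\sqrt{1-x^2-y^2}}
       && \bigl( y = \sqrt{1-x^2}\,\sin t, \;
                 \mathrm{d}y = \sqrt{1-x^2}\,\cos t \,\mathrm{d}t \bigr) \\
       &= \int_{-\pi/2}^{+\pi/2}
          \frac{\sqrt{1-x^2}\,\cos t}{2\pi\sqrt{1-x^2}\,\cos t} \,\mathrm{d}t
        = \frac{1}{2\pi} \cdot \pi = \frac12
       && \text{for } -1 < x < 1 .
\end{align}
```

The factor $\sqrt{1-x^2}$ cancels, which is exactly why the result is the same for every $x$ in $(-1,1)$.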

Conditional probability

Calculation

Given that $X=0.5,$ the conditional probability of the event $Y\le0.75$ is the integral of the conditional density,

$$\begin{aligned} & f_{Y|X=0.5}(y) = \frac{ f_{X,Y}(0.5,y) }{ f_X(0.5) } = \begin{cases} \frac1{ \pi \sqrt{0.75-y^2} } &\text{for } -\sqrt{0.75} < y < \sqrt{0.75},\\ 0 &\text{otherwise}; \end{cases} \\ & \mathbb{P} (Y \le 0.75|X=0.5) = \int_{-\infty}^{0.75} f_{Y|X=0.5}(y) \, \mathrm{d}y = \\ & = \int_{-\sqrt{0.75}}^{0.75} \frac{ \mathrm{d}y }{ \pi \sqrt{0.75-y^2} } = \frac12 + \frac1{\pi} \arcsin \sqrt{0.75} = \frac56 \, . \end{aligned}$$

More generally,

$$\mathbb{P} (Y \le y|X=x) = \frac12 + \frac1{\pi} \arcsin \frac{ y }{ \sqrt{1-x^2} }$$

for all $x$ and $y$ such that $-1 < x < 1$ (otherwise the denominator $f_X(x)$ vanishes) and $-\sqrt{1-x^2} < y < \sqrt{1-x^2}$ (otherwise the conditional probability degenerates to 0 or 1). One may also treat the conditional probability as a random variable, a function of the random variable $X$, namely,

$$\mathbb{P} (Y \le y|X) = \begin{cases} 0 &\text{for } X^2 \ge 1-y^2 \text{ and } y < 0,\\ \frac12 + \frac1{\pi} \arcsin \frac{ y }{ \sqrt{1-X^2} } &\text{for } X^2 < 1-y^2,\\ 1 &\text{for } X^2 \ge 1-y^2 \text{ and } y > 0. \end{cases}$$

The expectation of this random variable is equal to the (unconditional) probability,

$$\mathbb{E} ( \mathbb{P} (Y\le y|X) ) = \int_{-\infty}^{+\infty} \mathbb{P} (Y\le y|X=x) f_X(x) \, \mathrm{d}x = \mathbb{P} (Y\le y),$$

which is an instance of the law of total probability $E(P(A|X))=P(A).$
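The law of total probability on the level of densities can be checked numerically. The sketch below assumes the arcsine formula for $P(Y\le y|X=x)$ and the uniform density $f_X = 0.5$ on $(-1,1)$; the function names and the midpoint rule are illustrative choices.

```python
import math

def cond_cdf(y, x):
    """P(Y <= y | X = x) for -1 < x < 1, including the degenerate 0/1 cases."""
    r = math.sqrt(1.0 - x * x)      # radius of the circle cut out by X = x
    if y <= -r:
        return 0.0
    if y >= r:
        return 1.0
    return 0.5 + math.asin(y / r) / math.pi

def total_probability(y, n=20_000):
    """Midpoint-rule approximation of the integral of P(Y<=y | X=x) * f_X(x) dx."""
    h = 2.0 / n
    return sum(cond_cdf(y, -1.0 + (i + 0.5) * h) * 0.5 * h for i in range(n))

print(cond_cdf(0.75, 0.5))       # close to 5/6
print(total_probability(0.75))   # close to P(Y <= 0.75) = 0.875
```

Since $Y$ is itself uniform on $(-1,1)$, the integral should agree with $(y+1)/2$ up to discretization error.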

Interpretation

The conditional probability $P(Y\le0.75|X=0.5)$ cannot be interpreted as $P(Y\le0.75,X=0.5)/P(X=0.5),$ since the latter gives 0/0. Accordingly, $P(Y\le0.75|X=0.5)$ cannot be interpreted via empirical frequencies, since the exact value $X=0.5$ has no chance to appear at random, not even once during an infinite sequence of independent trials.

The conditional probability can be interpreted as a limit,

$$\begin{aligned} & \mathbb{P} (Y\le0.75 | X=0.5) = \lim_{\varepsilon\to0+} \mathbb{P} (Y\le0.75 | 0.5-\varepsilon < X < 0.5+\varepsilon) = \\ & = \lim_{\varepsilon\to0+} \frac{ \mathbb{P} (Y\le0.75, 0.5-\varepsilon < X < 0.5+\varepsilon) }{ \mathbb{P} (0.5-\varepsilon < X < 0.5+\varepsilon) } = \\ & = \lim_{\varepsilon\to0+} \frac{ \int_{0.5-\varepsilon}^{0.5+\varepsilon} \mathrm{d}x \int_{-\infty}^{0.75} \mathrm{d}y \, f_{X,Y}(x,y) }{ \int_{0.5-\varepsilon}^{0.5+\varepsilon} \mathrm{d}x \, f_X(x) } \, . \end{aligned}$$
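The limit interpretation suggests a simulation: condition on a narrow window around $X=0.5$ (an event of nonzero probability) and let the window shrink. The sketch below is a Monte Carlo illustration; sampling the sphere by normalizing a Gaussian vector, the sample size, the seed and the window widths are all assumptions made here.

```python
import math
import random

def random_sphere_point(rng):
    """Uniform point on the unit sphere: a normalised standard Gaussian vector."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0:
            return x / norm, y / norm, z / norm

rng = random.Random(0)   # fixed seed, so the run is reproducible
samples = [random_sphere_point(rng) for _ in range(200_000)]

def windowed_estimate(eps):
    """Empirical P(Y <= 0.75 | 0.5 - eps < X < 0.5 + eps)."""
    ys = [y for x, y, _ in samples if abs(x - 0.5) < eps]
    return sum(y <= 0.75 for y in ys) / len(ys)

for eps in (0.2, 0.1, 0.05):
    print(eps, windowed_estimate(eps))   # should drift towards 5/6 as eps shrinks
```

Narrower windows reduce the bias but leave fewer samples, so the statistical noise grows; this trade-off is inherent in the empirical version of the limit.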

Conditional expectation

The conditional expectation $E(Y|X=0.5)$ is of little interest; it vanishes just by symmetry. It is more interesting to calculate $E(|Z| \mid X=0.5),$ treating $|Z|$ as a function of $X$, $Y$:

$$\begin{aligned} & |Z| = h(X,Y) = \sqrt{1-X^2-Y^2} \, ; \\ & \mathbb{E} ( |Z| \mid X=0.5 ) = \int_{-\infty}^{+\infty} h(0.5,y) f_{Y|X=0.5} (y) \, \mathrm{d} y = \\ & = \int_{-\sqrt{0.75}}^{+\sqrt{0.75}} \sqrt{0.75-y^2} \cdot \frac{ \mathrm{d}y }{ \pi \sqrt{0.75-y^2} } = \frac2\pi \sqrt{0.75} \, . \end{aligned}$$

More generally,

$$\mathbb{E} ( |Z| \mid X=x ) = \frac2\pi \sqrt{1-x^2}$$

for $-1 < x < 1.$ One may also treat the conditional expectation as a random variable, a function of the random variable $X$, namely,

$$\mathbb{E} ( |Z| \mid X ) = \frac2\pi \sqrt{1-X^2} \, .$$

The expectation of this random variable is equal to the (unconditional) expectation of $|Z|,$

$$\mathbb{E} ( \mathbb{E} ( |Z| \mid X ) ) = \int_{-\infty}^{+\infty} \mathbb{E} ( |Z| \mid X=x ) f_X(x) \, \mathrm{d}x = \mathbb{E} (|Z|) \, ,$$

namely,

$$\int_{-1}^{+1} \frac2\pi \sqrt{1-x^2} \cdot \frac{ \mathrm{d}x }2 = \frac12 \, ,$$

which is an instance of the law of total expectation $E(E(Y|X))=E(Y).$
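Both integrals in this subsection can be approximated numerically. The sketch below assumes the conditional density $1/(\pi\sqrt{1-x^2-y^2})$ and uses the substitution $y = \sqrt{1-x^2}\,\sin t$ to remove the endpoint singularity; the function names and the midpoint rule are illustrative choices.

```python
import math

def cond_exp_numeric(x, n=2_000):
    """E(|Z| | X = x) by the midpoint rule, after substituting y = r * sin(t).

    With r = sqrt(1 - x^2) the conditional density times dy becomes dt / pi,
    and |z| = sqrt(r^2 - y^2) becomes r * cos(t) for -pi/2 < t < pi/2.
    """
    r = math.sqrt(1.0 - x * x)
    h = math.pi / n
    return sum(r * math.cos(-math.pi / 2 + (i + 0.5) * h) / math.pi * h
               for i in range(n))

def cond_exp_closed(x):
    """Closed form (2 / pi) * sqrt(1 - x^2)."""
    return 2.0 / math.pi * math.sqrt(1.0 - x * x)

def total_expectation(n=20_000):
    """Integral of E(|Z| | X = x) * f_X(x) over (-1, 1), with f_X = 0.5."""
    h = 2.0 / n
    return sum(cond_exp_closed(-1.0 + (i + 0.5) * h) * 0.5 * h for i in range(n))

print(cond_exp_numeric(0.5), cond_exp_closed(0.5))   # both close to 0.5513
print(total_expectation())                            # close to 0.5
```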

The random variable $E(|Z| \mid X)$ is the best predictor of $|Z|$ given $X$. That is, it minimizes the mean square error $E(|Z|-f(X))^2$ on the class of all random variables of the form $f(X).$ Similarly to the discrete case, $E(|Z| \mid g(X))=E(|Z| \mid X)$ for every measurable function $g$ that is one-to-one on $(-1,1).$

Conditional distribution

Given $X=x,$ the conditional distribution of $Y$, given by the density $f_{Y|X=x}(y),$ is the (rescaled) arcsin distribution; its cumulative distribution function is

$$F_{Y|X=x} (y) = \mathbb{P} ( Y \le y | X = x ) = \frac12 + \frac1\pi \arcsin \frac{y}{\sqrt{1-x^2}}$$

for all $x$ and $y$ such that $x^2+y^2 < 1.$ The corresponding expectation of $h(x,Y)$ is nothing but the conditional expectation $E(h(X,Y)|X=x).$ The mixture of these conditional distributions, taken for all $x$ (according to the distribution of $X$), is the unconditional distribution of $Y$. This fact amounts to the equalities

$$\begin{aligned} & \int_{-\infty}^{+\infty} f_{Y|X=x} (y) f_X(x) \, \mathrm{d}x = f_Y(y) \, , \\ & \int_{-\infty}^{+\infty} F_{Y|X=x} (y) f_X(x) \, \mathrm{d}x = F_Y(y) \, , \end{aligned}$$

the latter being the instance of the law of total probability mentioned above.

What conditioning is not

On the discrete level conditioning is possible only if the condition is of nonzero probability (one cannot divide by zero). On the level of densities, conditioning on $X=x$ is possible even though $P(X=x)=0.$ This success may create the illusion that conditioning is always possible. Regrettably, it is not, for several reasons presented below.

Geometric intuition: caution

The result $P(Y\le0.75|X=0.5)=5/6,$ mentioned above, is geometrically evident in the following sense. The points $(x,y,z)$ of the sphere $x^2+y^2+z^2=1,$ satisfying the condition $x=0.5,$ are a circle $y^2+z^2=0.75$ of radius $\sqrt{0.75}$ on the plane $x=0.5.$ The inequality $y\le0.75$ holds on an arc. The length of the arc is 5/6 of the length of the circle, which is why the conditional probability is equal to 5/6.

This successful geometric explanation may create the illusion that the following question is trivial.

A point of a given sphere is chosen at random (uniformly). Given that the point lies on a given plane, what is its conditional distribution?

It may seem evident that the conditional distribution must be uniform on the given circle (the intersection of the given sphere and the given plane). Sometimes it really is, but in general it is not. In particular, $Z$ is distributed uniformly on $(-1,+1)$ and independent of the ratio $Y/X,$ thus, $P(Z\le0.5|Y/X)=0.75.$ On the other hand, the inequality $z\le0.5$ holds on an arc of the circle $x^2+y^2+z^2=1,$ $y=cx$ (for any given $c$). The length of the arc is 2/3 of the length of the circle. However, the conditional probability is 3/4, not 2/3. This is a manifestation of the classical Borel paradox[1] [2].
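The discrepancy between 2/3 and 3/4 can be observed in a simulation that conditions on a narrow window of the ratio $Y/X$. This is a Monte Carlo sketch; the sampling method (normalized Gaussian vector), the choice $c=1$, the window width, the seed and the sample size are assumptions made for illustration.

```python
import math
import random

def random_sphere_point(rng):
    """Uniform point on the unit sphere: a normalised standard Gaussian vector."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 0:
            return x / norm, y / norm, z / norm

rng = random.Random(1)   # fixed seed, so the run is reproducible
samples = [random_sphere_point(rng) for _ in range(300_000)]

def prob_z_given_ratio(c, eps):
    """Empirical P(Z <= 0.5 | c - eps < Y/X < c + eps)."""
    zs = [z for x, y, z in samples if x != 0 and abs(y / x - c) < eps]
    return sum(z <= 0.5 for z in zs) / len(zs)

# Fraction of the great circle {y = c x} on which z <= 0.5:
# writing z = cos(phi), the condition is pi/3 <= phi <= 5*pi/3.
arc_fraction = (5 * math.pi / 3 - math.pi / 3) / (2 * math.pi)

print(arc_fraction)                    # 2/3 up to rounding: the naive answer
print(prob_z_given_ratio(1.0, 0.05))   # should be near 0.75 instead
```

The window in $Y/X$ is a thin double wedge that narrows towards the poles, so points near the poles are under-represented compared with arc length; this is the additional input that makes the answer 3/4.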

"Appeals to symmetry can be misleading if not formalized as invariance arguments." Pollard[3]

Another example. A random rotation of the three-dimensional space is a rotation by a random angle around a random axis. Geometric intuition suggests that the angle is independent of the axis and distributed uniformly. However, the latter is wrong; small values of the angle are less probable.
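The claim about small angles can be illustrated numerically. The sketch below assumes that "random rotation" means a rotation distributed uniformly (by Haar measure) on the rotation group, sampled as a uniform unit quaternion; under that assumption the angle has density $(1-\cos\theta)/\pi$ on $[0,\pi]$, a standard fact that is not stated in the text above. All names, the seed and the sample size are illustrative.

```python
import math
import random

def random_rotation_angle(rng):
    """Rotation angle in [0, pi] of a uniformly distributed rotation.

    Assumption: the rotation is represented by a uniform unit quaternion,
    obtained by normalising a 4-dimensional standard Gaussian vector.
    """
    q = [rng.gauss(0, 1) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w = abs(q[0]) / norm                  # |cos(angle / 2)|
    return 2.0 * math.acos(min(1.0, w))

rng = random.Random(2)   # fixed seed, so the run is reproducible
angles = [random_rotation_angle(rng) for _ in range(100_000)]
small = sum(a <= math.pi / 2 for a in angles) / len(angles)

print(small)                  # empirical P(angle <= pi/2)
print(0.5 - 1.0 / math.pi)    # integral of (1 - cos t) / pi over [0, pi/2]
```

A uniformly distributed angle would give probability 0.5 for the lower half of the range; the simulated value is well below that, in line with the statement that small angles are less probable.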

The limiting procedure

Given an event $B$ of zero probability, the formula $\mathbb{P} (A|B) = \mathbb{P} ( A \cap B ) / \mathbb{P} (B)$ is useless; however, one can try $\mathbb{P} (A|B) = \lim_{n\to\infty} \mathbb{P} ( A \cap B_n ) / \mathbb{P} (B_n)$ for an appropriate sequence of events $B_n$ of nonzero probability such that $B_n\downarrow B$ (that is, $B_1 \supset B_2 \supset \dots$ and $B_1 \cap B_2 \cap \dots = B$). One example is given above. Two more examples are Brownian bridge and Brownian excursion.

In the latter two examples the law of total probability is irrelevant, since only a single event (the condition) is given. In contrast, in the example above the law of total probability applies, since the event $X=0.5$ is included in a family of events $X=x$ where $x$ runs over $(-1,1),$ and these events are a partition of the probability space.

In order to avoid paradoxes (such as Borel's paradox), the following important distinction should be taken into account. If a given event is of nonzero probability then conditioning on it is well-defined (irrespective of any other events), as was noted above. In contrast, if the given event is of zero probability then conditioning on it is ill-defined unless some additional input is provided. A wrong choice of this additional input leads to wrong conditional probabilities (expectations, distributions). In this sense, "the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible." (Kolmogorov; quoted in [3]).

The additional input may be (a) a symmetry (invariance group); (b) a sequence of events $B_n$ such that $B_n \downarrow B$, $P(B_n)>0$; (c) a partition containing the given event. Measure-theoretic conditioning (below) investigates case (c) and discloses its relation to (b) in general and to (a) when applicable.

Some events of zero probability are beyond the reach of conditioning. An example: let $X_n$ be independent random variables distributed uniformly on $(0,1)$, and $B$ the event "$X_n \to 0$ as $n\to\infty$"; what about $P(X_n<0.5|B)$? Does it tend to 1, or not?
Another example: let $X$ be a random variable distributed uniformly on $(0,1)$, and $B$ the event "$X$ is a rational number"; what about $P(X=1/n|B)$? The only answer is that, once again, "the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible." (Kolmogorov; quoted in [3]).

Conditioning on the level of measure theory

Example. Let $Y$ be a random variable distributed uniformly on $(0,1)$, and $X=f(Y)$ where $f$ is a given function. Two cases are treated below: $f=f_1$ and $f=f_2$, where $f_1$ is the continuous piecewise-linear function

$$ f_1(y) = \begin{cases} 3y &\text{for } 0 \le y \le 1/3,\\ 1.5(1-y) &\text{for } 1/3 \le y \le 2/3,\\ 0.5 &\text{for } 2/3 \le y \le 1, \end{cases} $$

and $f_2$ is the everywhere continuous but nowhere differentiable Weierstrass function.

Geometric intuition: caution

Given $X=0.75$, two values of $Y$ are possible, 0.25 and 0.5. It may seem evident that both values are of conditional probability 0.5 just because one point is congruent to another point. However, this is an illusion; see below.

Conditional probability

The conditional probability $P(Y\le1/3|X)$ may be defined as the best predictor of the indicator

$$ I = \begin{cases} 1 &\text{if } Y \le 1/3,\\ 0 &\text{otherwise}, \end{cases} $$

given $X$. That is, it minimizes the mean square error $E(I-g(X))^2$ on the class of all random variables of the form $g(X)$.

In the case $f=f_1$ the corresponding function $g=g_1$ may be calculated explicitly,[4]

$$ g_1(x) = \begin{cases} 1 &\text{for } 0 < x < 0.5,\\ 0 &\text{for } x = 0.5,\\ 1/3 &\text{for } 0.5 < x < 1. \end{cases} $$

Alternatively, the limiting procedure may be used,

$$ g_1(x) = \lim_{\varepsilon\to0+} \mathbb{P} ( Y \le 1/3 | x-\varepsilon \le X \le x+\varepsilon ) \, , $$

giving the same result.
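The limiting procedure can be illustrated numerically. The following is a minimal sketch, not part of the original text: it replaces $Y$ by a uniform midpoint grid and uses a small fixed $\varepsilon$ instead of a true limit. The names `f1` and `g1_limit`, the grid size and the value of `eps` are assumptions of this sketch.

```python
def f1(y):
    """Piecewise-linear function f_1 from the example."""
    if y <= 1 / 3:
        return 3 * y
    if y <= 2 / 3:
        return 1.5 * (1 - y)
    return 0.5


def g1_limit(x, eps=0.003, n=300_000):
    """Approximate P(Y <= 1/3 | x-eps <= X <= x+eps) for X = f1(Y).

    Y uniform on (0,1) is replaced by the midpoint grid y_i = (i + 0.5) / n
    (an assumption of this sketch, not an exact computation).
    """
    hits = 0    # grid points with X inside the window
    hits_a = 0  # those that also satisfy Y <= 1/3
    for i in range(n):
        y = (i + 0.5) / n
        if abs(f1(y) - x) <= eps:
            hits += 1
            if y <= 1 / 3:
                hits_a += 1
    return hits_a / hits
```

Calling `g1_limit(0.25)`, `g1_limit(0.75)` and `g1_limit(0.5)` should give values close to 1, 1/3 and 0 respectively; at $x=0.5$ the value is small but positive for any fixed `eps` and shrinks as `eps` decreases.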

Thus, $P(Y\le1/3|X)=g_1(X)$. The expectation of this random variable is equal to the (unconditional) probability, $E(P(Y\le1/3|X))=P(Y\le1/3)$, namely,

$$ 1 \cdot \mathbb{P} (X<0.5) + 0 \cdot \mathbb{P} (X=0.5) + \frac13 \cdot \mathbb{P} (X>0.5) = 1 \cdot \frac16 + 0 \cdot \frac13 + \frac13 \cdot \Big( \frac16 + \frac13 \Big) = \frac13 \, , $$

which is an instance of the law of total probability $E(P(A|X))=P(A)$.
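The arithmetic of this instance of the law of total probability can be checked with exact rational numbers. This is a small sketch; the variable names are illustrative, and the three probabilities are read off from the definition of $f_1$.

```python
from fractions import Fraction as F

# Probabilities of the three cases for X = f_1(Y), Y uniform on (0,1).
p_lt = F(1, 6)            # P(X < 0.5): Y in (0, 1/6)
p_eq = F(1, 3)            # P(X = 0.5): Y in [2/3, 1), up to a null set
p_gt = F(1, 6) + F(1, 3)  # P(X > 0.5): Y in (1/6, 1/3) or (1/3, 2/3)

# Weighted sum of the values of g_1 on the three cases.
total = 1 * p_lt + 0 * p_eq + F(1, 3) * p_gt
```

Here `total` is the exact value of $E(g_1(X))$, to be compared with $P(Y\le1/3)$.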

In the case $f=f_2$ the corresponding function $g=g_2$ probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically. Indeed, the space $L_2(\Omega)$ of all square integrable random variables is a Hilbert space; the indicator $I$ is a vector of this space; and random variables of the form $g(X)$ are a (closed, linear) subspace. The orthogonal projection of this vector to this subspace is well-defined. It can be computed numerically, using finite-dimensional approximations to the infinite-dimensional Hilbert space.
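One possible finite-dimensional approximation is sketched below; it is only an illustration under stated assumptions, not the method of the original text. The subspace of all $g(X)$ is replaced by functions of $X$ that are constant on each of finitely many bins, for which the orthogonal projection of $I$ is the average of $I$ over each bin. The text does not specify the parameters of the Weierstrass function, so `f2` uses a truncated sum $\sum_k a^k \cos(b^k \pi y)$ with $a=0.5$, $b=3$ and 8 terms as a stand-in; the grid size and the number of bins are likewise assumptions.

```python
import math


def f2(y, a=0.5, b=3, terms=8):
    """Truncated Weierstrass-type sum; a, b and the truncation are assumptions."""
    return sum(a**k * math.cos(b**k * math.pi * y) for k in range(terms))


def project_indicator(n=30_000, bins=50):
    """Project I = 1{Y <= 1/3} onto functions of X constant on each bin.

    Returns (g, weights): g[j] is the projection's value on bin j
    (None for an empty bin), weights[j] is the probability of bin j.
    """
    ys = [(i + 0.5) / n for i in range(n)]  # midpoint grid standing in for Y
    xs = [f2(y) for y in ys]
    lo, hi = min(xs), max(xs)
    count = [0] * bins
    count_a = [0] * bins
    for y, x in zip(ys, xs):
        j = min(bins - 1, int((x - lo) / (hi - lo) * bins))
        count[j] += 1
        if y <= 1 / 3:
            count_a[j] += 1
    g = [ca / c if c else None for ca, c in zip(count_a, count)]
    weights = [c / n for c in count]
    return g, weights


g2_approx, w = project_indicator()
# Expectation of the projected random variable g(X).
mean_g = sum(gj * wj for gj, wj in zip(g2_approx, w) if gj is not None)
```

Because the constant function 1 belongs to the approximating subspace, `mean_g` reproduces the grid value of $P(Y\le1/3)$ regardless of the number of bins; refining the bins changes `g2_approx` but not its mean.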

Once again, the expectation of the random variable $P(Y\le1/3|X)=g_2(X)$ is equal to the (unconditional) probability, $E(P(Y\le1/3|X))=P(Y\le1/3)$, namely,

$$ \int_0^1 g_2 (f_2(y)) \, \mathrm{d}y = \frac13 \, . $$

However, the Hilbert space approach treats $g_2$ as an equivalence class of functions rather than an individual function. Measurability of $g_2$ is ensured, but continuity (or even Riemann integrability) is not. The value $g_2(0.5)$ is determined uniquely, since the point 0.5 is an atom of the distribution of $X$. Other values $x$ are not atoms; thus, the corresponding values $g_2(x)$ are not determined uniquely. Once again, "the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible." (Kolmogorov; quoted in [3]).

Alternatively, the same function $g$ (be it $g_1$ or $g_2$) may be defined as the Radon-Nikodym derivative

$$ g = \frac{ \mathrm{d}\nu }{ \mathrm{d}\mu } \, , $$

where measures μ, ν are defined by

$$ \begin{aligned} \mu (B) &= \mathbb{P} ( X \in B ) \, , \\ \nu (B) &= \mathbb{P} ( X \in B, \, Y \le 1/3 ) \end{aligned} $$

for all Borel sets $B \subset \mathbb R$. That is, μ is the (unconditional) distribution of $X$, while ν is one third of its conditional distribution,

$$ \nu (B) = \mathbb{P} ( X \in B | Y \le 1/3 ) \mathbb{P} ( Y \le 1/3 ) = \frac13 \mathbb{P} ( X \in B | Y \le 1/3 ) \, . $$

Both approaches (via the Hilbert space, and via the Radon-Nikodym derivative) treat $g$ as an equivalence class of functions; two functions $g$ and $g'$ are treated as equivalent if $g(X)=g'(X)$ almost surely. Accordingly, the conditional probability $P(Y\le1/3|X)$ is treated as an equivalence class of random variables; as usual, two random variables are treated as equivalent if they are equal almost surely.
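As a worked check for the case $f=f_1$ (the densities below are computed here from the definition of $f_1$; they are not stated in the text): on $(0,0.5)$ only the branch $3y$ contributes, so μ has density $1/3$ there; on $(0.5,1)$ the branches $3y$ and $1.5(1-y)$ contribute densities $1/3$ and $2/3$; the point $0.5$ carries an atom of mass $1/3$. The measure ν receives only the branch $3y$ (where $Y\le1/3$), so it has density $1/3$ on $(0,1)$ and no atom. Therefore

$$ \frac{\mathrm{d}\nu}{\mathrm{d}\mu}(x) = \begin{cases} \dfrac{1/3}{1/3} = 1 &\text{for } 0 < x < 0.5,\\ \dfrac{0}{1/3} = 0 &\text{for } x = 0.5,\\ \dfrac{1/3}{1/3+2/3} = \dfrac13 &\text{for } 0.5 < x < 1, \end{cases} $$

which agrees with $g_1$ (up to equivalence).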

Conditional expectation

The conditional expectation $E(Y|X)$ may be defined as the best predictor of $Y$ given $X$. That is, it minimizes the mean square error $E(Y-h(X))^2$ on the class of all random variables of the form $h(X)$.

In the case $f=f_1$ the corresponding function $h=h_1$ may be calculated explicitly,[5]

$$ h_1(x) = \begin{cases} x/3 &\text{for } 0 < x < 0.5,\\ 5/6 &\text{for } x = 0.5,\\ (2-x)/3 &\text{for } 0.5 < x < 1. \end{cases} $$

Alternatively, the limiting procedure may be used,

$$ h_1(x) = \lim_{\varepsilon\to0+} \mathbb{E} ( Y | x-\varepsilon \le X \le x+\varepsilon ) \, , $$

giving the same result.

Thus, $E(Y|X)=h_1(X)$. The expectation of this random variable is equal to the (unconditional) expectation, $E(E(Y|X))=E(Y)$, namely,

$$ \begin{aligned} & \int_0^1 h_1(f_1(y)) \, \mathrm{d}y = \int_0^{1/6} \frac{3y}3 \, \mathrm{d}y + \\ & \quad + \int_{1/6}^{1/3} \frac{2-3y}3 \, \mathrm{d}y + \int_{1/3}^{2/3} \frac{ 2 - 1.5(1-y) }{ 3 } \, \mathrm{d}y + \int_{2/3}^1 \frac56 \, \mathrm{d}y = \frac12 \, , \end{aligned} $$

which is an instance of the law of total expectation $E(E(Y|X))=E(Y)$.
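The integral above can also be checked numerically. The sketch below (the function names and the grid size are assumptions of this illustration) evaluates $h_1(f_1(y))$ on a midpoint grid; since the integrand is linear on each of the four pieces and the grid cells are aligned with the breakpoints $1/6$, $1/3$, $2/3$, the midpoint rule is exact here up to rounding.

```python
def f1(y):
    """Piecewise-linear function f_1 from the example."""
    if y <= 1 / 3:
        return 3 * y
    if y <= 2 / 3:
        return 1.5 * (1 - y)
    return 0.5


def h1(x):
    """E(Y | X = x) for the case f = f_1, as given in the text."""
    if x == 0.5:
        return 5 / 6
    if x < 0.5:
        return x / 3
    return (2 - x) / 3


def mean_of_h1(n=6000):
    """Midpoint-rule value of the integral of h1(f1(y)) over (0, 1).

    n should be divisible by 6 so that cell boundaries hit 1/6, 1/3, 2/3.
    """
    return sum(h1(f1((i + 0.5) / n)) for i in range(n)) / n
```

The returned value should agree with $E(Y)=1/2$ to within floating-point error.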

In the case $f=f_2$ the corresponding function $h=h_2$ probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically in the same way as $g_2$ above, as the orthogonal projection in the Hilbert space. The law of total expectation holds, since the projection cannot change the scalar product with the constant function 1, which belongs to the subspace.

Alternatively, the same function $h$ (be it $h_1$ or $h_2$) may be defined as the Radon-Nikodym derivative

$$ h = \frac{ \mathrm{d}\nu }{ \mathrm{d}\mu } \, , $$

where measures $\mu,\nu$ are defined by

$$ \begin{aligned} \mu (B) &= \mathbb{P} ( X \in B ) \, , \\ \nu (B) &= \mathbb{E} ( Y; \, X \in B ) \end{aligned} $$

for all Borel sets $B \subset \mathbb R$. Here $E(Y;A)$ is the restricted expectation, not to be confused with the conditional expectation $E(Y|A)=E(Y;A)/P(A)$.
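For instance, in the case $f=f_1$ with $B=(0,0.5)$ (a worked check computed here from the definition of $f_1$, not taken from the text): the event $X\in B$ coincides with the event $0<Y<1/6$, so

$$ \nu(B) = \mathbb{E}(Y; \, X\in B) = \int_0^{1/6} y \, \mathrm{d}y = \frac1{72} \, , \quad \mu(B) = \frac16 \, , \quad \mathbb{E}(Y | X \in B) = \frac{1/72}{1/6} = \frac1{12} \, . $$

The same value of $\nu(B)$ results from integrating $h_1$ against μ, which has density $1/3$ on $(0,0.5)$:

$$ \int_0^{0.5} \frac{x}{3} \cdot \frac13 \, \mathrm{d}x = \frac19 \cdot \frac18 = \frac1{72} \, . $$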

Conditional distribution

In the case $f=f_1$ the conditional cumulative distribution function may be calculated explicitly, similarly to $g_1$. The limiting procedure gives

$$ \begin{aligned} & F_{Y|X=0.75} (y) = \mathbb{P} ( Y \le y | X = 0.75 ) = \\ & = \lim_{\varepsilon\to0+} \mathbb{P} ( Y \le y | 0.75-\varepsilon \le X \le 0.75+\varepsilon ) = \\ & = \begin{cases} 0 &\text{for } -\infty < y < 1/4,\\ 1/6 &\text{for } y = 1/4,\\ 1/3 &\text{for } 1/4 < y < 1/2,\\ 2/3 &\text{for } y = 1/2,\\ 1 &\text{for } 1/2 < y < \infty, \end{cases} \end{aligned} $$

which cannot be correct, since a cumulative distribution function must be right-continuous!

This paradoxical result is explained by measure theory as follows. For a given $y$ the corresponding $F_{Y|X=x}(y)=P(Y\le y|X=x)$ is well-defined (via the Hilbert space or the Radon-Nikodym derivative) as an equivalence class of functions (of $x$). Treated as a function of $y$ for a given $x$, it is ill-defined unless some additional input is provided. Namely, a function (of $x$) must be chosen within every (or at least almost every) equivalence class. A wrong choice leads to wrong conditional cumulative distribution functions.

A right choice can be made as follows. First, $F_{Y|X=x}(y)=P(Y\le y|X=x)$ is considered for rational numbers $y$ only. (Any other dense countable set may be used equally well.) Thus, only a countable set of equivalence classes is used; all choices of functions within these classes are mutually equivalent, and the corresponding function of rational $y$ is well-defined (for almost every $x$). Second, the function is extended from rational numbers to real numbers by right continuity.

In general the conditional distribution is defined for almost all $x$ (according to the distribution of $X$), but sometimes the result is continuous in $x$, in which case individual values are acceptable. In the considered example this is the case; the correct result for $x=0.75$,

$$ \begin{aligned} & F_{Y|X=0.75} (y) = \mathbb{P} ( Y \le y | X = 0.75 ) = \\ & = \begin{cases} 0 &\text{for } -\infty < y < 1/4,\\ 1/3 &\text{for } 1/4 \le y < 1/2,\\ 1 &\text{for } 1/2 \le y < \infty \end{cases} \end{aligned} $$

shows that the conditional distribution of $Y$ given $X=0.75$ consists of two atoms, at 0.25 and 0.5, of probabilities 1/3 and 2/3 respectively.
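The difference between the pointwise limit and the right-continuous version at the two atoms can be seen numerically. The following is a rough sketch under the same kind of assumptions as any grid approximation: $Y$ is replaced by a midpoint grid and the limit by a small fixed `eps`; the names `f1` and `cdf_limit` are illustrative.

```python
def f1(y):
    """Piecewise-linear function f_1 from the example."""
    if y <= 1 / 3:
        return 3 * y
    if y <= 2 / 3:
        return 1.5 * (1 - y)
    return 0.5


def cdf_limit(y0, x=0.75, eps=0.003, n=300_000):
    """Approximate P(Y <= y0 | x-eps <= X <= x+eps) on a midpoint grid."""
    hits = 0     # grid points with X inside the window
    hits_le = 0  # those that also satisfy Y <= y0
    for i in range(n):
        y = (i + 0.5) / n
        if abs(f1(y) - x) <= eps:
            hits += 1
            if y <= y0:
                hits_le += 1
    return hits_le / hits
```

At `y0 = 0.25` and `y0 = 0.5` the window around each atom is cut in half, so the approximation is near 1/6 and 2/3; just to the right of these points (for example `y0 = 0.3` and `y0 = 0.6`) it is near 1/3 and 1, the values that the right-continuous version assigns at the atoms themselves.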

Similarly, the conditional distribution may be calculated for all $x$ in $(0,0.5)$ or $(0.5,1)$.

The value $x=0.5$ is an atom of the distribution of $X$; thus, the corresponding conditional distribution is well-defined and may be calculated by elementary means (the denominator does not vanish); the conditional distribution of $Y$ given $X=0.5$ is uniform on $(2/3,1)$. Measure theory leads to the same result.

The mixture of all conditional distributions is the (unconditional) distribution of $Y$.

The conditional expectation $E(Y|X=x)$ is nothing but the expectation with respect to the conditional distribution.
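As a worked check in the case $f=f_1$, the first line below computes the conditional expectation at the atom from the uniform distribution on $(2/3,1)$ (density $3$). The second line averages the conditional expectation over the distribution of $X$. It assumes what can be read off from the proofs in the notes: $E(Y|X=x)$ equals $x/3$ on $(0,0.5)$ and $(2-x)/3$ on $(0.5,1)$, and $X$ has density $1/3$ on $(0,0.5)$, density $1$ on $(0.5,1)$ and an atom of mass $1/3$ at $0.5.$

```latex
\begin{aligned}
E(Y \mid X = 0.5) &= \int_{2/3}^{1} y \cdot 3 \, \mathrm{d}y
  = \frac32 \Big( 1 - \frac49 \Big) = \frac56 , \\
E\big( E(Y \mid X) \big) &= \int_0^{0.5} \frac{x}{3} \cdot \frac13 \, \mathrm{d}x
  + \frac13 \cdot \frac56
  + \int_{0.5}^{1} \frac{2-x}{3} \, \mathrm{d}x
  = \frac{1}{72} + \frac{20}{72} + \frac{15}{72} = \frac12 = E(Y) .
\end{aligned}
```

The second line is the mixture property applied to expectations: averaging over the conditional distributions recovers the unconditional mean of $Y$.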

In the case $f=f_2$ the corresponding $F_{Y|X=x}(y)=P(Y\le y|X=x)$ probably cannot be calculated explicitly. For a given $y$ it is well-defined (via the Hilbert space or the Radon–Nikodym derivative) as an equivalence class of functions (of $x$). The right choice of functions within these equivalence classes may be made as above; it leads to correct conditional cumulative distribution functions, and thus to conditional distributions. In general, conditional distributions need not be atomic or absolutely continuous (nor mixtures of both types). Probably, in the considered example they are singular (like the Cantor distribution).

Once again, the mixture of all conditional distributions is the (unconditional) distribution, and the conditional expectation is the expectation with respect to the conditional distribution.

See also

Notes

  1. Pollard 2002, Sect. 5.5, Example 17 on page 122
  2. Durrett 1996, Sect. 4.1(a), Example 1.6 on page 224
  3. Pollard 2002, Sect. 5.5, page 122
  4. Proof:
    $$\begin{aligned} & \mathbb{E} ( I - g(X) )^2 = \\ & = \int_0^{1/3} (1-g(3y))^2 \, \mathrm{d}y + \int_{1/3}^{2/3} g^2 (1.5(1-y)) \, \mathrm{d}y + \int_{2/3}^1 g^2 (0.5) \, \mathrm{d}y = \\ & = \int_0^1 (1-g(x))^2 \frac{ \mathrm{d}x }{ 3 } + \int_{0.5}^1 g^2(x) \frac{ \mathrm{d} x }{ 1.5 } + \frac13 g^2(0.5) = \\ & = \frac13 \int_0^{0.5} (1-g(x))^2 \, \mathrm{d}x + \frac13 g^2(0.5) + \frac13 \int_{0.5}^1 ( (1-g(x))^2 + 2g^2(x) ) \, \mathrm{d}x \, ; \end{aligned}$$
    it remains to note that $(1-a)^2+2a^2$ is minimal at $a=1/3.$
  5. Proof:
    $$\begin{aligned} & \mathbb{E} ( Y - h_1(X) )^2 = \int_0^1 \Big( y - h_1 ( f_1(y) ) \Big)^2 \, \mathrm{d}y = \\ & = \int_0^{1/3} (y-h_1(3y))^2 \, \mathrm{d}y + \int_{1/3}^{2/3} \Big( y - h_1( 1.5(1-y) ) \Big)^2 \, \mathrm{d}y + \int_{2/3}^1 \Big( y - h_1(0.5) \Big)^2 \, \mathrm{d}y = \\ & = \int_0^1 \Big( \frac x 3 - h_1(x) \Big)^2 \frac{ \mathrm{d}x }{ 3 } + \int_{0.5}^1 \Big( 1 - \frac{x}{1.5} - h_1(x) \Big)^2 \frac{ \mathrm{d} x }{ 1.5 } + \frac13 h_1^2(0.5) - \frac 5 9 h_1(0.5) + \frac{19}{81} = \\ & = \frac13 \int_0^{0.5} \Big( h_1(x) - \frac x 3 \Big)^2 \, \mathrm{d}x + \frac13 h_1^2(0.5) - \frac 5 9 h_1(0.5) + \frac{19}{81} + \\ & \quad + \frac13 \int_{0.5}^1 \bigg( \Big( h_1(x) - \frac x 3 \Big)^2 + 2 \Big( h_1(x) - 1 + \frac{2x}3 \Big)^2 \bigg) \, \mathrm{d}x \, ; \end{aligned}$$
    it remains to note that $\textstyle (a-\frac x 3)^2 + 2(a-1+\frac{2x}3)^2$ is minimal at $\textstyle a = \frac{2-x}3,$ and $\textstyle \frac13 a^2 - \frac 5 9 a$ is minimal at $\textstyle a = \frac 5 6.$
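The final minimization steps in notes 4 and 5 can be checked with exact arithmetic. The sketch below is an illustration only: the helper `argmin_quadratic` is a name introduced here, and it assumes its argument is a quadratic in $a$ with positive leading coefficient, which holds for the three expressions in the notes.

```python
from fractions import Fraction

def argmin_quadratic(q):
    """Exact minimizer of q(a) = A*a^2 + B*a + C (A > 0), recovered from three values of q."""
    q0, q1, qm = q(Fraction(0)), q(Fraction(1)), q(Fraction(-1))
    A = (q1 + qm) / 2 - q0   # leading coefficient
    B = (q1 - qm) / 2        # linear coefficient
    return -B / (2 * A)      # vertex of the parabola

# Note 4: (1 - a)^2 + 2 a^2
a_note4 = argmin_quadratic(lambda a: (1 - a) ** 2 + 2 * a ** 2)

# Note 5, at a sample point x in (0.5, 1): (a - x/3)^2 + 2 (a - 1 + 2x/3)^2
x = Fraction(3, 4)
a_note5 = argmin_quadratic(lambda a: (a - x / 3) ** 2 + 2 * (a - 1 + 2 * x / 3) ** 2)

# Note 5, at the atom x = 0.5: a^2/3 - 5a/9
a_atom = argmin_quadratic(lambda a: Fraction(1, 3) * a ** 2 - Fraction(5, 9) * a)

print(a_note4, a_note5, (2 - x) / 3, a_atom)
```

The three minimizers agree with the values stated in the notes: $1/3$, $(2-x)/3$ at the sample point, and $5/6.$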

References

  • Durrett, Richard (1996), Probability: theory and examples (Second ed.)
  • Pollard, David (2002), A user's guide to measure theoretic probability, Cambridge University Press

Category:Probability theory