Independent and identically distributed random variables

In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent.[1] This property is usually abbreviated as i.i.d., iid, or IID. Herein, i.i.d. is used, because it is the most prevalent.

Introduction

In statistics, it is commonly assumed that observations in a sample are effectively i.i.d. The assumption (or requirement) that observations be i.i.d. tends to simplify the underlying mathematics of many statistical methods (see mathematical statistics and statistical theory). In practical applications of statistical modeling, however, the assumption may or may not be realistic.[2] To partially test how realistic the assumption is on a given data set, the autocorrelation can be computed, lag plots drawn, or a turning point test performed.[3] The weaker assumption of exchangeability is often sufficient and more easily met.
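Two of the checks mentioned above can be sketched in a few lines. This is a minimal illustration, not a full diagnostic procedure: the function names `lag_autocorrelation` and `turning_point_z` are made up for this example, and the turning point statistic uses the standard large-sample result that, for n i.i.d. observations from a continuous distribution, the number of turning points has mean 2(n − 2)/3 and variance (16n − 29)/90.

```python
import math
import random

def lag_autocorrelation(xs, lag=1):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var

def turning_point_z(xs):
    """z-statistic of the turning point test (near 0 for i.i.d. data)."""
    n = len(xs)
    # A turning point is a strict local maximum or minimum.
    turning = sum(
        1
        for i in range(1, n - 1)
        if (xs[i] > xs[i - 1] and xs[i] > xs[i + 1])
        or (xs[i] < xs[i - 1] and xs[i] < xs[i + 1])
    )
    expected = 2 * (n - 2) / 3
    variance = (16 * n - 29) / 90
    return (turning - expected) / math.sqrt(variance)

rng = random.Random(0)
iid = [rng.gauss(0, 1) for _ in range(5000)]

# Cumulative sums of the same noise: a strongly dependent sequence.
walk = []
total = 0.0
for x in iid:
    total += x
    walk.append(total)

print("i.i.d. noise:", lag_autocorrelation(iid), turning_point_z(iid))
print("random walk :", lag_autocorrelation(walk), turning_point_z(walk))
```

For the i.i.d. noise both statistics should be close to zero, while the random walk should show a lag-1 autocorrelation near 1 and a strongly negative z-statistic. Passing such checks does not prove that data are i.i.d.; it only fails to reveal one particular kind of dependence.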

The i.i.d. assumption is important in the classical form of the central limit theorem, which states that the probability distribution of the sum (or average) of i.i.d. variables with finite variance approaches a normal distribution.
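The statement can be illustrated with a small simulation. The sketch below assumes i.i.d. Uniform(0, 1) draws (any distribution with finite variance would do) and standardizes each sample mean using the known mean 1/2 and variance 1/12; the function name and the sample sizes are arbitrary choices for this example.

```python
import random
import statistics

def standardized_means(sample_size, trials, rng):
    """Standardized sample means of i.i.d. Uniform(0, 1) draws."""
    mu = 0.5                  # mean of Uniform(0, 1)
    sigma = (1 / 12) ** 0.5   # standard deviation of Uniform(0, 1)
    out = []
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(sample_size)) / sample_size
        out.append((mean - mu) / (sigma / sample_size ** 0.5))
    return out

rng = random.Random(1)
z = standardized_means(sample_size=50, trials=10_000, rng=rng)

# For a standard normal variable, about 68.27% of the mass lies within
# one standard deviation of the mean.
within_one_sd = sum(abs(v) <= 1 for v in z) / len(z)
print(statistics.mean(z), statistics.pstdev(z), within_one_sd)
```

If the normal approximation is good, the printed mean should be near 0, the standard deviation near 1, and the last fraction near 0.68.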

Often the i.i.d. assumption arises in the context of sequences of random variables. Then "independent and identically distributed" implies that an element in the sequence is independent of the random variables that came before it. In this way, an i.i.d. sequence is different from a Markov sequence, where the probability distribution for the nth random variable is a function of the previous random variable in the sequence (for a first-order Markov sequence). Being i.i.d. does not require that all elements of the sample space or event space be equally probable.[4] For example, repeated throws of loaded dice will produce a sequence that is i.i.d., despite the outcomes being biased.
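The contrast between an i.i.d. sequence and a first-order Markov sequence can be made concrete with a loaded die. Everything in this sketch is hypothetical: the bias in `LOADED`, the "sticky" Markov rule (repeat the previous face with probability 0.8, otherwise roll the loaded die afresh), and the helper names.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]
LOADED = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]  # hypothetical bias toward six

def iid_rolls(n, rng):
    """n independent throws of the same loaded die."""
    return rng.choices(FACES, weights=LOADED, k=n)

def sticky_rolls(n, rng, stay=0.8):
    """First-order Markov sequence: with probability `stay` repeat the
    previous face, otherwise throw the loaded die afresh."""
    seq = [rng.choices(FACES, weights=LOADED)[0]]
    for _ in range(n - 1):
        if rng.random() < stay:
            seq.append(seq[-1])
        else:
            seq.append(rng.choices(FACES, weights=LOADED)[0])
    return seq

def p_six_after(seq, previous):
    """Relative frequency of a six immediately after the face `previous`."""
    following = [b for a, b in zip(seq, seq[1:]) if a == previous]
    return sum(b == 6 for b in following) / len(following)

rng = random.Random(2)
iid = iid_rolls(100_000, rng)
sticky = sticky_rolls(100_000, rng)

print("i.i.d.:", p_six_after(iid, 6), p_six_after(iid, 1))
print("Markov:", p_six_after(sticky, 6), p_six_after(sticky, 1))
```

In the i.i.d. sequence the chance of a six should be about 0.5 regardless of the previous face, even though the die is biased. In the sticky sequence the theoretical values are 0.8 + 0.2 × 0.5 = 0.9 after a six and 0.2 × 0.5 = 0.1 after a one, so the previous outcome matters.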

Definition

Definition for two random variables

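Suppose that the random variables $X$ and $Y$ are defined to assume values in $I \subseteq \mathbb{R}$. Let $F_X(x) = \operatorname{P}(X \leq x)$ and $F_Y(y) = \operatorname{P}(Y \leq y)$ be the cumulative distribution functions of $X$ and $Y$, respectively, and denote their joint cumulative distribution function by $F_{X,Y}(x,y) = \operatorname{P}(X \leq x \land Y \leq y)$.
Two random variables $X$ and $Y$ are identically distributed if and only if[5] $F_X(x) = F_Y(x)$ for all $x \in I$.
Two random variables $X$ and $Y$ are independent if and only if $F_{X,Y}(x,y) = F_X(x) \cdot F_Y(y)$ for all $x, y \in I$. (See further Independence (probability theory) § Two random variables.)
Two random variables $X$ and $Y$ are i.i.d. if they are independent and identically distributed, i.e. if and only if
$$F_X(x) = F_Y(x) \quad \forall x \in I, \qquad F_{X,Y}(x,y) = F_X(x) \cdot F_Y(y) \quad \forall x, y \in I.$$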
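The two conditions can be checked exactly for small discrete examples. The sketch below assumes two variables taking values on the faces of a die and represents their joint distribution as a dictionary of exact fractions; the helper names are invented for this illustration. "Identically distributed" is tested by comparing the two marginal cumulative distribution functions, and "independent" by testing whether the joint cumulative distribution function equals the product of the marginals. Because the variables are supported on the integers 1 to 6, checking these points is sufficient.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)

def cdfs(joint):
    """Return (F_X, F_Y, F_XY) for a joint pmf given as {(x, y): probability}."""
    def F_XY(x, y):
        return sum(p for (a, b), p in joint.items() if a <= x and b <= y)
    def F_X(x):
        return F_XY(x, max(FACES))   # marginal: no constraint on Y
    def F_Y(y):
        return F_XY(max(FACES), y)   # marginal: no constraint on X
    return F_X, F_Y, F_XY

def is_identically_distributed(joint):
    F_X, F_Y, _ = cdfs(joint)
    return all(F_X(t) == F_Y(t) for t in FACES)

def is_independent(joint):
    F_X, F_Y, F_XY = cdfs(joint)
    return all(F_XY(x, y) == F_X(x) * F_Y(y) for x, y in product(FACES, FACES))

# Two separate fair dice: every pair of faces has probability 1/36.
two_dice = {(x, y): Fraction(1, 36) for x, y in product(FACES, FACES)}
# The same die read twice (Y = X): identically distributed, not independent.
same_die = {(x, x): Fraction(1, 6) for x in FACES}

for name, joint in [("two dice", two_dice), ("same die", same_die)]:
    print(name, is_identically_distributed(joint), is_independent(joint))
```

The second example shows why both conditions are needed: reading the same die twice gives identical marginal distributions, but the pair is as far from independent as possible.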

Definition for more than two random variables

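The definition extends naturally to more than two random variables. We say that $n$ random variables $X_1, \ldots, X_n$ are i.i.d. if they are independent (see further Independence (probability theory) § More than two random variables) and identically distributed, i.e. if and only if
$$F_{X_1}(x) = F_{X_k}(x) \quad \forall k \in \{1, \ldots, n\},\ \forall x \in I, \qquad F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \forall x_1, \ldots, x_n \in I,$$
where $F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = \operatorname{P}(X_1 \leq x_1 \land \ldots \land X_n \leq x_n)$ denotes the joint cumulative distribution function of $X_1, \ldots, X_n$.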
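For more than two variables, independence means mutual independence of the whole collection, which is stronger than independence of every pair. A standard illustration, sketched here with invented helper names, takes two independent fair bits and sets the third to their exclusive-or: the three bits are identically distributed and pairwise independent, yet the joint cumulative distribution function does not factorize, so they are not i.i.d.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (X1, X2, X3) with X1, X2 independent fair bits and X3 = X1 XOR X2.
xor_joint = {(a, b, a ^ b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}
# Joint pmf of three independent fair bits.
iid_joint = {bits: Fraction(1, 8) for bits in product((0, 1), repeat=3)}

def joint_cdf(joint, point):
    """P(X1 <= x1 and ... and Xn <= xn)."""
    return sum(p for outcome, p in joint.items()
               if all(o <= t for o, t in zip(outcome, point)))

def marginal_cdf(joint, index, t):
    """P(X_index <= t)."""
    return sum(p for outcome, p in joint.items() if outcome[index] <= t)

def is_iid(joint, n=3, values=(0, 1)):
    """Check both conditions of the definition on the support points."""
    same_marginals = all(
        marginal_cdf(joint, k, t) == marginal_cdf(joint, 0, t)
        for k in range(n) for t in values
    )
    factorizes = True
    for point in product(values, repeat=n):
        expected = Fraction(1)
        for k, t in enumerate(point):
            expected *= marginal_cdf(joint, k, t)
        if joint_cdf(joint, point) != expected:
            factorizes = False
    return same_marginals and factorizes

print("three fair bits:", is_iid(iid_joint))
print("XOR triple     :", is_iid(xor_joint))
```

In the XOR example the joint cumulative distribution function at (0, 0, 0) is 1/4, whereas the product of the three marginals is 1/8.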

Examples

The following are examples or applications of i.i.d. random variables:

  • A sequence of outcomes of spins of a fair or unfair roulette wheel is i.i.d. One implication of this is that if the roulette ball lands on "red", for example, 20 times in a row, the next spin is no more or less likely to be "black" than on any other spin (see the Gambler's fallacy).

  • A sequence of fair or loaded dice rolls is i.i.d.

  • A sequence of fair or unfair coin flips is i.i.d.

  • In signal processing and image processing the notion of transformation to i.i.d. implies two specifications, the "i.d." (i.d. = identically distributed) part and the "i." (i. = independent) part: (i.d.) the signal level must be balanced on the time axis; (i.) the signal spectrum must be flattened, i.e. transformed by filtering (such as deconvolution) to a white noise signal (i.e. a signal where all frequencies are equally present).
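The roulette example can be checked by simulation. This sketch assumes a single-zero wheel (18 red, 18 black, 1 green pocket) and a streak length of 5 rather than 20, so that enough streaks occur in a modest number of spins; the function name is made up for this illustration.

```python
import random

def black_rate_after_red_streak(spins, streak):
    """Fraction of blacks among spins that follow `streak` consecutive reds."""
    hits = total = 0
    run = 0  # number of consecutive reds ending at the previous spin
    for colour in spins:
        if run >= streak:
            total += 1
            hits += colour == "black"
        run = run + 1 if colour == "red" else 0
    return hits / total

rng = random.Random(3)
spins = rng.choices(["red", "black", "green"], weights=[18, 18, 1], k=400_000)

overall = spins.count("black") / len(spins)
after_streak = black_rate_after_red_streak(spins, streak=5)
print(overall, after_streak)
```

Both printed frequencies should be close to 18/37 ≈ 0.486: a run of reds does not make black any more likely on the next spin.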

Generalizations

Many results that were first proven under the assumption that the random variables are i.i.d. have been shown to be true even under a weaker distributional assumption.

Exchangeable random variables

The most general notion that shares the main properties of i.i.d. variables is that of exchangeable random variables, introduced by Bruno de Finetti. Exchangeability means that while variables may not be independent, future ones behave like past ones. Formally, any value of a finite sequence is as likely as any permutation of those values: the joint probability distribution is invariant under the symmetric group.

This provides a useful generalization – for example, sampling without replacement is not independent, but is exchangeable.
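The sampling example can be verified exactly for a small urn. The urn contents and helper names below are hypothetical. Two balls are drawn without replacement, and the joint distribution of their colours is computed by enumerating all equally likely ordered pairs of distinct balls.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

URN = ["R", "R", "B"]  # hypothetical urn: two red balls, one blue ball

def draw_two_without_replacement():
    """Exact joint pmf of the colours of the first two draws."""
    orders = list(permutations(range(len(URN)), 2))  # ordered pairs of distinct balls
    counts = Counter((URN[i], URN[j]) for i, j in orders)
    return {pair: Fraction(c, len(orders)) for pair, c in counts.items()}

joint = draw_two_without_replacement()

def p(first, second):
    """Joint probability of the two colours (zero if the pair cannot occur)."""
    return joint.get((first, second), Fraction(0))

p_first_red = p("R", "R") + p("R", "B")
p_second_red = p("R", "R") + p("B", "R")

print("exchangeable:", p("R", "B") == p("B", "R"))
print("same marginals:", p_first_red == p_second_red)
print("independent:", p("R", "R") == p_first_red * p_second_red)
```

Here swapping the order of the two draws leaves the joint probabilities unchanged and both draws are red with probability 2/3, but P(red, red) = 1/3 differs from 2/3 × 2/3 = 4/9, so the draws are exchangeable without being independent.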

Lévy process

In stochastic calculus, i.i.d. variables are thought of as the increments of a discrete-time Lévy process: each variable gives the change in the process from one time step to the next. For example, a sequence of Bernoulli trials is interpreted as the Bernoulli process. This can be generalized to continuous-time Lévy processes, and many Lévy processes can be seen as limits of i.i.d. variables; for instance, the Wiener process arises as the scaling limit of the random walk driven by a Bernoulli process.
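The link between Bernoulli trials and the Wiener process can be hinted at numerically. The sketch below assumes the usual construction: fair Bernoulli trials are mapped to steps of +1 or −1, n steps are taken per unit of time, and the walk is divided by √n. The rescaled walk at time t then has mean 0 and variance ⌊nt⌋/n ≈ t, matching the Wiener process. The function name and the parameter values are arbitrary choices for this illustration, and matching moments is only a necessary sign of convergence, not a proof.

```python
import random
import statistics

def scaled_walk_at(t, n, rng):
    """Value at time t of a +1/-1 random walk with n steps per unit time,
    rescaled by 1/sqrt(n)."""
    steps = int(n * t)
    position = sum(1 if rng.random() < 0.5 else -1 for _ in range(steps))
    return position / n ** 0.5

rng = random.Random(4)
n, paths = 200, 4000
end_values = [scaled_walk_at(1.0, n, rng) for _ in range(paths)]
half_values = [scaled_walk_at(0.5, n, rng) for _ in range(paths)]

print("t = 1.0:", statistics.mean(end_values), statistics.pvariance(end_values))
print("t = 0.5:", statistics.mean(half_values), statistics.pvariance(half_values))
```

The sample variances should be close to 1 and 0.5 respectively, and the sample means close to 0.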

See also

  • De Finetti's theorem

References

[1] Clauset, Aaron (2011). "A brief primer on probability distributions" (PDF). Santa Fe Institute.
[2] Hampel, Frank (1998). "Is statistics too difficult?". Canadian Journal of Statistics, 26: 497–513. doi:10.2307/3315772 (§8).
[3] Le Boudec, Jean-Yves (2010). Performance Evaluation Of Computer And Communication Systems (PDF). EPFL Press. pp. 46–47. ISBN 978-2-940222-40-7.
[4] Cover, T. M.; Thomas, J. A. (2006). Elements Of Information Theory. Wiley-Interscience. pp. 57–58. ISBN 978-0-471-24195-9.
[5] Casella, George; Berger, Roger L. (2002). Statistical Inference. Duxbury Advanced Series. Theorem 1.5.10.
The original version of this page is from Wikipedia. Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.