The
probability <math>P</math> of some
event <math>E</math> (denoted <math>P(E)</math>) is defined with respect to a "universe" or
sample space <math>S</math> of all possible
elementary events in such a way that <math>P</math> must satisfy the Kolmogorov axioms.
Alternatively, a probability can be interpreted as a measure on a sigma-algebra of subsets of the sample space, those subsets being the events, such that the measure of the whole set equals 1. This property is important, since it gives rise to the natural concept of conditional probability. Every event <math>A</math> with non-zero probability defines another probability on the space:
- <math>P(B \vert A) = {P(B \cap A) \over P(A)}.</math>
This is usually read as "probability of <math>B</math> given <math>A</math>". <math>B</math> and <math>A</math> are said to be
independent if the conditional probability of <math>B</math> given <math>A</math> is the same as the probability of <math>B</math>.
In the case that the sample space is finite or countably infinite, a probability function can also be defined by its values on the elementary events <math>\{e_1\}, \{e_2\}, \ldots</math> where <math>S = \{e_1, e_2, \ldots\}</math>.
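As an illustration of the finite case, the following minimal Python sketch defines a probability function through its values on the elementary events and derives conditional probability and independence from it. The fair six-sided die, the `weights` table, and the function names `prob`, `cond_prob`, and `independent` are assumptions chosen for the example, not part of the definitions above.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
# The probability function is fixed by its values on the elementary events.
weights = {e: Fraction(1, 6) for e in range(1, 7)}

def prob(event):
    """P(E): sum of the weights of the elementary events contained in E."""
    return sum((weights[e] for e in event), Fraction(0))

def cond_prob(b, a):
    """P(B | A) = P(B ∩ A) / P(A), defined only when P(A) is non-zero."""
    p_a = prob(a)
    if p_a == 0:
        raise ValueError("P(A) must be non-zero")
    return prob(b & a) / p_a

def independent(a, b):
    """B and A are independent when P(B | A) equals P(B)."""
    return cond_prob(b, a) == prob(b)

even = {2, 4, 6}
at_most_two = {1, 2}
at_most_three = {1, 2, 3}

print(cond_prob(at_most_two, even))      # 1/3
print(independent(even, at_most_two))    # True
print(independent(even, at_most_three))  # False
```

Exact fractions are used instead of floating-point numbers so that the equality test for independence is not affected by rounding.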
Kolmogorov axioms
For any event <math>E</math>:
- <math>0 \leq P(E) \leq 1.</math>
That is, the probability of an event set is represented by a real number between 0 and 1.
- <math>P(S) = 1</math>.
That is, the probability that some elementary event in the entire sample set will occur is 1, or certainty. More specifically, there are no elementary events outside the sample set. This is often overlooked in mistaken probability calculations: if the whole sample set cannot be precisely defined, then the probability of any subset cannot be defined either.
Any sequence of mutually disjoint events <math>E_1, E_2, \ldots</math> satisfies
- <math>P(E_1 \cup E_2 \cup \cdots) = \sum_{i} P(E_i)</math>.
That is, the probability of an event set which is the union of other disjoint subsets is the sum of the probabilities of those subsets. This is called σ-additivity. If there is any overlap among the subsets, this relation need not hold.
These axioms are known as the Kolmogorov axioms, after Andrey Kolmogorov, who developed them.
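On a finite sample space the axioms can be checked by brute force, which may help make them concrete. The sketch below is only an illustration under stated assumptions: the three-outcome experiment and its `weights` are invented, every subset is treated as an event, and σ-additivity is tested only for pairs of disjoint events (on a finite space, additivity for longer disjoint sequences follows from the pairwise case by induction).

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: a biased experiment with three elementary events.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
sample_space = frozenset(weights)

def prob(event):
    """P(E): sum of the weights of the elementary events contained in E."""
    return sum((weights[e] for e in event), Fraction(0))

def all_events(space):
    """Every subset of a finite sample space (its power set)."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def satisfies_axioms(space):
    events = all_events(space)
    # First axiom: 0 <= P(E) <= 1 for every event E.
    bounded = all(0 <= prob(e) <= 1 for e in events)
    # Second axiom: P(S) = 1.
    normalized = prob(space) == 1
    # Third axiom, restricted to pairs of disjoint events.
    additive = all(
        prob(e1 | e2) == prob(e1) + prob(e2)
        for e1 in events
        for e2 in events
        if not (e1 & e2)
    )
    return bounded and normalized and additive

print(satisfies_axioms(sample_space))  # True

# Overlapping events: the sum of probabilities overshoots the union.
e1, e2 = frozenset("ab"), frozenset("bc")
print(prob(e1 | e2), prob(e1) + prob(e2))  # 1 4/3
```

The last two lines show why disjointness matters: the elementary event "b" is counted twice in the sum.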
Lemmas in probability
From these axioms one can deduce other useful rules for calculating probabilities. For example:
- <math>P(A \cup B) = P(A) + P(B) - P(A \cap B)</math>
That is, the probability that A or B will happen is the sum of the
probabilities that A will happen and that B will happen, minus the
probability that A and B will happen.
- <math>P(S - E) = 1 - P(E)</math>
That is, the probability that any event will not happen is 1 minus the probability that it will.
Using conditional probability as defined above, it also follows immediately that:
- <math>P(A \cap B) = P(A) \cdot P(B \vert A)</math>
That is, the probability that A and B will happen is the probability
that A will happen, times the probability that B will happen given
that A happened. It then follows that A and B are independent if and only if
- <math>P(A \cap B) = P(A) \cdot P(B)</math>.
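The rules in this section can be checked numerically on a small example. The following sketch assumes a fair six-sided die with equally likely outcomes; the events `a`, `b`, `c` and the helper names are hypothetical and serve only to illustrate the formulas.

```python
from fractions import Fraction

# Assumed example: a fair die, so P(E) is the share of outcomes in E.
sample_space = frozenset(range(1, 7))

def prob(event):
    return Fraction(len(event), len(sample_space))

def is_independent(a, b):
    """Independence criterion: P(A ∩ B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

a = frozenset({2, 4, 6})   # even roll
b = frozenset({4, 5, 6})   # roll greater than 3
c = frozenset({1, 2})      # roll at most 2

# P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert prob(a | b) == prob(a) + prob(b) - prob(a & b)

# P(S - E) = 1 - P(E)
assert prob(sample_space - a) == 1 - prob(a)

# P(A ∩ B) = P(A) * P(B | A), with P(B | A) = P(B ∩ A) / P(A)
p_b_given_a = prob(b & a) / prob(a)
assert prob(a & b) == prob(a) * p_b_given_a

print(is_independent(a, b))  # False: P(A ∩ B) = 1/3, P(A) * P(B) = 1/4
print(is_independent(a, c))  # True:  P(A ∩ C) = 1/6, P(A) * P(C) = 1/6
```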
See also
frequency probability -- personal probability -- eclectic probability -- statistical regularity