Probability Basics
Most of the mathematics of investing concerns probabilities–the “chance” of a thing occurring. So we’ll start here.
A case study: flipping a coin
You’re likely already familiar with this concept: what is the chance of a coin flip resulting in a “heads” up landing? Intuitively, you’d say “50%”. This is a very informal, intuitive understanding of probability that most of us have, and that we encounter often in our daily lives.
Formalizing the coin flip
Mathematics depends on something more rigorous than intuition, though. We can formalize the concept of probability in the language of mathematics.
A coin has two sides: one is heads, the other tails. We flip it in such a way that it is guaranteed to land either heads-up or tails-up (never on its edge), and with no bias, so that neither heads nor tails is favored.
We can call the result of the coin flip an outcome. So the outcome of any given coin flip is either “heads” or “tails”.
When we flip the coin, what we have done is called an experiment or trial.
Now, before we flip the coin we don’t know what the outcome will be. It could be heads. It could be tails. But we know that those are the only two possible outcomes.
Here it’s going to get a little “mathy”, but the descriptions should provide clarity regarding how to read the mathematics below.
We denote a specific outcome by using a capital ‘O’ and a subscript with the outcome label. For example, to denote the “heads up” outcome of a coin flip, we use \(O_{heads}\) and for “tails up” we use \(O_{tails}\). When we want to represent any single outcome, but not a specific one, we simply use \(O\).
We denote the probability of a specific outcome with a capital ‘P’ followed by the outcome in parentheses. For example, the probability of the “heads up” outcome would be denoted \(P(O_{heads})\).
When we want to represent a generic number of things, such as a number of outcomes, we use a capital ‘N’: \(N\). We can make this more specific, e.g., to write “the total number of outcomes” we would use \(N_{outcomes}\).
Since all the possible outcomes of our coin flip are equally likely, we can find the probability that a specific single outcome occurs by dividing 1 by the total number of possible outcomes:
$$ P(O) = \frac{1}{N_{outcomes}} $$
Let’s use our understanding of coin flips to write the probability for “heads” and for “tails” outcomes.
There are 2 possible outcomes: “heads” or “tails”. So:
$$ N_{outcomes} = 2 $$
$$ P(O_{heads}) = \frac{1}{N_{outcomes}} = \frac{1}{2} = 50\% $$
$$ P(O_{tails}) = \frac{1}{N_{outcomes}} = \frac{1}{2} = 50\% $$
Note that when we add all the probabilities of individual outcomes up, we get 1!
$$ P(O_{heads}) + P(O_{tails}) = \frac{1}{2} + \frac{1}{2} = 1 $$
When you add the probabilities of all the possible outcomes, where the outcomes are mutually exclusive (only one of them can occur in a given trial), the total probability is always 1.
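The equal-likelihood formula can be sketched in a few lines of Python. This is only an illustration: the helper name `uniform_probabilities` is made up for this example, and `Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction


def uniform_probabilities(outcomes):
    """Give every outcome the same probability, 1 / N_outcomes."""
    n_outcomes = len(outcomes)
    return {outcome: Fraction(1, n_outcomes) for outcome in outcomes}


# A fair coin has two equally likely outcomes.
coin = uniform_probabilities(["heads", "tails"])

print(coin["heads"])       # 1/2
print(coin["tails"])       # 1/2
print(sum(coin.values()))  # 1
```

The same helper would work for any list of equally likely outcomes, such as the six faces of a fair die.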
We assumed that both possible outcomes of the coin flip were equally likely. We’d call such a coin a “fair” coin.
But the coin doesn’t have to be fair. Maybe one side is made of a slightly denser metal, or maybe the “heads” figure depicted on the coin causes the “heads” side to land face down more often than not.
We can account for this by weighting the outcomes. Often we will presuppose the weights, or else determine them by experiment. In either case, the weights are expressed as fractions, and the sum of all weights must add up to 1.
Suppose the “tails” side of a coin is slightly denser than the “heads” side, so that “heads” comes up 55% of the time instead of 50%.
Since there are only two possible outcomes, the probability of tails must be 1 minus the probability of heads:
\(P(O_{heads}) = \frac{55}{100} = 0.55\) and \(P(O_{tails}) = 1 - P(O_{heads}) = 1 - \frac{55}{100} = \frac{45}{100} = 0.45\)
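As a rough check of the weighted-coin arithmetic, here is a small sketch. The variable names are chosen for this example only, and the 55% figure is the assumed weight from the text, not a measured value.

```python
from fractions import Fraction

# Assumed weight for heads, taken from the example above.
p_heads = Fraction(55, 100)

# With only two possible outcomes, tails gets whatever probability is left.
p_tails = 1 - p_heads

weights = {"heads": p_heads, "tails": p_tails}

print(float(p_heads))         # 0.55
print(float(p_tails))         # 0.45
print(sum(weights.values()))  # 1
```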
Analyzing Probability Problems
In these kinds of basic probability problems, perhaps the hardest part is making two determinations:
- What all the possible outcomes are.
- Whether each individual outcome has the same likelihood of occurring as the others. If the outcomes aren’t equally likely, what is the likelihood of each one?
Let’s run through an example, using a 6-sided die.
Dice have a number of sides, most commonly 6. In this example, we’ll roll a single fair die.
If we want to discuss probabilities related to rolling this die, we must identify all the possible outcomes. There are 6: 1 for each side of the die.
We must also determine whether the outcomes are all equally likely. In this case, since the die is “fair”, all 6 outcomes are equally likely. Therefore, the probability of getting any particular number on a roll is \(P(O_{any\ number}) = \frac{1}{6} \approx 16.7\%\)
What if our 6-sided die is instead weighted so that the 6 is rolled 20% (\(\frac{1}{5}\)) of the time instead of ~16.7% (\(\frac{1}{6}\)) of the time, but otherwise all the other sides are equally likely?
In this case, the possible outcomes are still 1 through 6.
However, the outcomes aren’t all equally likely. The probability of getting a 6 is 20% (by definition). This means the probability of getting any other number is 80% (\(1 - 20\% = 80\%\)). Since there are 5 other possible numbers, and they’re all equally likely, the probability of getting each one of them is \(P(O_{1-5}) = \frac{80\%}{5} = 16\%\)
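The weighted-die reasoning can be reproduced with a short sketch. The names are illustrative, and the 20% weight for the 6 is the assumption from the example.

```python
from fractions import Fraction

# Assumed weight for rolling a 6 (20%).
p_six = Fraction(1, 5)

# The remaining probability is shared equally by the other five faces.
p_other = (1 - p_six) / 5

die = {face: p_other for face in range(1, 6)}
die[6] = p_six

print(float(p_other))     # 0.16
print(sum(die.values()))  # 1
```

Checking that the weights sum to 1 is a quick way to catch arithmetic slips when outcomes are not equally likely.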
In the above discussion about coin flips and rolling a die, we had distinct outcomes (e.g. a “heads” coin flip or a roll of a 2 on a die) that matched the physical “space” of the system exactly:
- 2 sides of a coin, 2 possible outcomes (heads or tails)
- 6 sides of a die, 6 possible outcomes (a number between 1 and 6, inclusive)
Sometimes this isn’t the case. We need to be able to account for these cases. For example, suppose we have a bag filled with marbles. There are 5 green marbles, 3 red marbles, and 2 blue ones. We want to discuss probabilities related to pulling a single random marble out of the bag.
In this case there are only 3 possible outcomes: a red marble, a blue marble, or a green marble. However, there are 10 marbles to pull. So we don’t have an exact “1 to 1 mapping” of the possible outcomes to the “space” of marbles to pick. We need additional concepts and language to discuss probabilities in this context.
Sets and Elements
In the bag of marbles example, we can think of the bag as holding a “set” of marbles:
\(S_{marbles} = \{green, green, green, green, green, red, red, red, blue, blue\}\)
Each marble in the “set” is an element of the set.
When we discuss these kinds of sets, where there are distinct elements we can physically count and some of them have the same label, we can identify the probability of randomly choosing an element with a specific label by counting the number of times elements with that label appear in the set, and dividing by the number of elements in total in the set:
\(P(O_{label}) = \frac{C_{label}}{N_{elements}}\)
There are 10 marbles in the bag, so there are 10 elements. This lets us immediately identify the probability of picking a marble with a specific color out of the bag:
\(P(O_{red}) = \frac{C_{red}}{10} = \frac{3}{10}\)
\(P(O_{green}) = \frac{C_{green}}{10} = \frac{5}{10}\)
\(P(O_{blue}) = \frac{C_{blue}}{10} = \frac{2}{10}\)
Notice that the total probability still adds up to 1:
\(P(O_{red}) + P(O_{green}) + P(O_{blue}) = \frac{3}{10} + \frac{5}{10} + \frac{2}{10} = \frac{10}{10} = 1\)
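The count-and-divide rule for the bag of marbles can be sketched as follows. This is a minimal illustration; the list `bag` simply spells out the ten marbles from the example.

```python
from collections import Counter
from fractions import Fraction

# The ten marbles in the bag, one list entry per marble.
bag = ["green"] * 5 + ["red"] * 3 + ["blue"] * 2

counts = Counter(bag)  # C_label for each color
n_elements = len(bag)  # N_elements

probabilities = {
    label: Fraction(count, n_elements) for label, count in counts.items()
}

print(probabilities["red"])         # 3/10
print(probabilities["green"])       # 1/2
print(probabilities["blue"])        # 1/5
print(sum(probabilities.values()))  # 1
```

Note that `Fraction` reduces to lowest terms, so \(\frac{5}{10}\) prints as `1/2` and \(\frac{2}{10}\) as `1/5`.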
Repeated Experiments
The above examples (coin flips, dice, marbles in a bag) can all be tied together to discuss repeated experiments.
For example, a single coin flip is an “experiment.” What if we flip the coin twice and want to know the probability of getting heads at least once?
We can use the notion of sets to determine this answer. We can list all the outcomes as a set. We’ll represent a result by an H if the result is a heads, and T if it’s a tails. The outcomes are:
- HH (heads twice)
- HT (heads, then tails)
- TH (tails, then heads)
- TT (tails twice)
And we can write this in our set notation like so: \(S = \{HH, HT, TH, TT\}\)
This means the set has four elements (we can simply count them). Three of the elements have at least one heads. So the probability of getting heads at least once is \(P(O_{at\ least\ one\ H}) = \frac{3}{4}\).
We can also ask “what is the probability of getting heads exactly once?”
In this case, heads occurs exactly once in just two elements: HT and TH. The probability is \(P(O_{exactly\ 1\ H}) = \frac{2}{4} = 50\%\)
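The two-flip set can also be enumerated in code. The sketch below assumes a fair coin, so all four elements are equally likely; the helper `probability` is a name invented for this example.

```python
from fractions import Fraction
from itertools import product

# Build every sequence of two flips: HH, HT, TH, TT.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]


def probability(event):
    """Fraction of equally likely elements for which `event` is true."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))


p_at_least_one_h = probability(lambda outcome: "H" in outcome)
p_exactly_one_h = probability(lambda outcome: outcome.count("H") == 1)

print(sample_space)      # ['HH', 'HT', 'TH', 'TT']
print(p_at_least_one_h)  # 3/4
print(p_exactly_one_h)   # 1/2
```

Changing `repeat=2` to a larger number would enumerate longer sequences of flips in the same way.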
Exercises
- What is the probability of getting two heads in the above example?
- What is the probability of getting tails on the second flip?
Chance of Something Not Happening
In most of the above material, we’ve asked questions from a “positive” outcome perspective: what is the chance of something occurring. We can also think of it from the “negative” perspective: what is the chance of something not occurring.
In all our examples, the total of all probabilities adds up to 1, every time. This is always true.
As a result, we can think of a “not” outcome as being a sort of inverse of the “positive” outcome:
\(P(O) = 1 - P(not\ O)\)
For example, in the “bag of marbles” problem from above, we can ask “what is the probability of picking a marble that is not red?”
There are two ways we can do this:
- add up all the probabilities for each “not red” result, i.e., \(P(O_{green}) + P(O_{blue}) = \frac{5}{10} + \frac{2}{10} = \frac{7}{10} = 70\%\)
- find the probability of a red marble, then subtract it from 1: \(P(O_{not\ red}) = 1 - P(O_{red}) = 1 - \frac{3}{10} = \frac{7}{10}\)
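Both routes to the “not red” probability can be compared in a short sketch. The dictionary `p` simply restates the marble probabilities from the earlier example.

```python
from fractions import Fraction

# Marble probabilities from the bag example.
p = {
    "red": Fraction(3, 10),
    "green": Fraction(5, 10),
    "blue": Fraction(2, 10),
}

# Route 1: add up every outcome that is not red.
p_not_red_by_sum = sum(prob for color, prob in p.items() if color != "red")

# Route 2: subtract the probability of red from 1.
p_not_red_by_complement = 1 - p["red"]

print(p_not_red_by_sum)         # 7/10
print(p_not_red_by_complement)  # 7/10
```

The complement route is usually less work when there are many “not” outcomes and only one excluded outcome.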
Exercises
- What is the probability of not getting a heads in the two-coin-flip example?
- What is the probability of not getting a tails on the first flip in the two-coin-flip example?
More on Sets
We can see from the above examples that the concepts of “sets” and “elements” are quite useful.
In fact, we can subdivide any probability problem that is cast in these terms into smaller sets.