# The odds ratio

In statistics, the odds ratio[1][2][3] (usually abbreviated “OR”) is one of three main ways to quantify how strongly the presence or absence of property A is associated with the presence or absence of property B in a given population. If each individual in a population either does or does not have a property “A” (e.g. “high blood pressure”), and also either does or does not have a property “B” (e.g. “moderate alcohol consumption”), where both properties are appropriately defined, then a ratio can be formed which quantitatively describes the association between the presence/absence of “A” (high blood pressure) and the presence/absence of “B” (moderate alcohol consumption) for individuals in the population. This ratio is the odds ratio (OR) and can be computed following these steps:

1. For a given individual who has “B”, compute the odds that the same individual has “A”.
2. For a given individual who does not have “B”, compute the odds that the same individual has “A”.
3. Divide the odds from step 1 by the odds from step 2 to obtain the odds ratio (OR).
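
The three steps above can be sketched directly from the counts in a 2 × 2 table. The numbers below are purely illustrative (they do not come from the text):

```python
# Illustrative counts for a 2x2 table:
# rows: has B / lacks B; columns: has A / lacks A
a, b = 30, 70   # among those with B:    30 have A, 70 do not
c, d = 10, 90   # among those without B: 10 have A, 90 do not

odds_given_b = a / b               # step 1: odds of A among those with B
odds_given_not_b = c / d           # step 2: odds of A among those without B
odds_ratio = odds_given_b / odds_given_not_b  # step 3
print(odds_ratio)  # (30/70) / (10/90) = 2700/700 ≈ 3.857
```

Note that the result reduces to the familiar cross-product form $ad/bc$, here $(30 \times 90)/(70 \times 10)$.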

The term “individual” in this usage does not have to refer to a human being, as a statistical population can measure any set of entities, whether living or inanimate.

If the OR is greater than 1, then having “A” is considered to be “associated” with having “B” in the sense that having “B” raises (relative to not having “B”) the odds of having “A”. Note that this is not sufficient to establish that “B” is a contributing cause of “A”: the association could be due to a third property, “C”, which is a contributing cause of both “A” and “B” (confounding).

The two other major ways of quantifying association are the risk ratio (“RR”) and the absolute risk reduction (“ARR”). In clinical studies and many other settings, the parameter of greatest interest is often actually the RR, which is determined in a way that is similar to the one just described for the OR, except using probabilities instead of odds. Frequently, however, the available data only allows the computation of the OR; notably, this is so in the case of case-control studies, as explained below. On the other hand, if one of the properties (say, A) is sufficiently rare (the “rare disease assumption“), then the OR of having A given that the individual has B is a good approximation to the corresponding RR (the specification “A given B” is needed because, while the OR treats the two properties symmetrically, the RR and other measures do not).
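
The rare disease assumption can be checked numerically. The probabilities below are assumed for the sketch; the point is that when the outcome A is rare, odds approximate probabilities and so the OR approximates the RR:

```python
# Assumed (illustrative) probabilities of a rare outcome A:
p_a_given_b = 0.02      # P(A | B)
p_a_given_not_b = 0.01  # P(A | not B)

rr = p_a_given_b / p_a_given_not_b          # risk ratio: ratio of probabilities
odds = lambda p: p / (1 - p)                # convert a probability to odds
or_ = odds(p_a_given_b) / odds(p_a_given_not_b)  # odds ratio: ratio of odds

print(rr)   # 2.0
print(or_)  # ≈ 2.0204 — close to the RR because A is rare in both groups
```

As A becomes more common the two measures diverge: with $P(A\mid B)=0.5$ and $P(A\mid \lnot B)=0.25$, the RR is 2 but the OR is 3.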

In more technical language, the OR is a measure of effect size, describing the strength of association or non-independence between two binary data values. It is used as a descriptive statistic, and plays an important role in logistic regression.

## Relative risk

In statistics and epidemiology, relative risk (RR) is the ratio of the probability of an event occurring (for example, developing a disease, or being injured) in an exposed group to the probability of the event occurring in a comparison, non-exposed group. Relative risk includes two important features: (i) a comparison of risk between two “exposures” puts risks in context, and (ii) “exposure” is ensured by having proper denominators for each group representing the exposure.[1][2]

$RR= \frac {p_\text{event when exposed}}{p_\text{event when non-exposed}}$
| Risk       | Disease present | Disease absent |
|------------|-----------------|----------------|
| Smoker     | $a$             | $b$            |
| Non-smoker | $c$             | $d$            |

Consider an example where the probability of developing lung cancer among smokers was 20% and among non-smokers 1%. This situation is expressed in the 2 × 2 table above.

Here, a = 20, b = 80, c = 1, and d = 99. Then the relative risk of cancer associated with smoking would be

$RR=\frac {a/(a+b)}{c/(c+d)} = \frac {20/100}{1/100} = 20.$

Smokers would be twenty times as likely as non-smokers to develop lung cancer.
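
The worked example above can be reproduced in a few lines, using the counts from the table ($a = 20$, $b = 80$, $c = 1$, $d = 99$):

```python
# Counts from the 2x2 table in the text: smokers (a, b), non-smokers (c, d)
a, b, c, d = 20, 80, 1, 99

# Relative risk: ratio of the probability of disease in each group
rr = (a / (a + b)) / (c / (c + d))
print(rr)  # 20.0

# For comparison, the odds ratio for the same table
or_ = (a / b) / (c / d)
print(or_)  # (20/80) / (1/99) = 24.75
```

Because the disease is not rare among smokers (20%), the OR (24.75) noticeably overstates the RR (20) here.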

Another term for the relative risk is the risk ratio, because it is the risk in the exposed divided by the risk in the unexposed. Relative risk contrasts with the actual or absolute risk, and may be confused with it in the media or elsewhere.