Annex A. Understanding explained and unexplained differences between two groups through a counter-factual exercise: The Oaxaca-Blinder decomposition

In the early 1970s, Oaxaca and Blinder popularised a framework for decomposing the difference in an outcome between two groups into a part attributable to observable characteristics and a part that is not. A typical application constructs a counterfactual that divides the observed gap between two mutually exclusive sub-groups into a component due to differences in the observed characteristics of individuals and a component due to differences in the structure that links those characteristics to the outcome (Fortin, Lemieux and Firpo, 2011[1]). Since then, the Oaxaca-Blinder decomposition has been one of the most widely used methods for this purpose. A simplified version of the model splits the intergroup difference into two parts, one explained and one unexplained. The decomposition asks how much of the difference in mean outcomes, $R = E(Y_A) - E(Y_B)$, where $E(Y_A)$ and $E(Y_B)$ are the expected values of the outcome variable for groups A and B, is accounted for by group differences in the predictors.
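
As a sketch of the setup usually assumed for this decomposition, suppose the outcome in each group follows a linear model with a zero-mean error. The symbols $X_g$ (a vector of predictors including a constant), $\beta_g$ (group-specific coefficients) and $\varepsilon_g$ (the error term) are notation introduced here for illustration. Under that assumption the gap in expected outcomes depends only on the group means of the predictors and on the coefficients:

$$
Y_g = X_g'\beta_g + \varepsilon_g, \qquad E(\varepsilon_g) = 0, \qquad g \in \{A, B\}
$$

$$
R = E(Y_A) - E(Y_B) = E(X_A)'\beta_A - E(X_B)'\beta_B
$$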

Assuming a linear model estimated separately for each group, the difference in mean outcomes between groups A and B can be written as:

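$$
R = \bar{Y}_A - \bar{Y}_B = (\bar{X}_A - \bar{X}_B)'\hat{\beta}_B + \bar{X}_B'(\hat{\beta}_A - \hat{\beta}_B) + (\bar{X}_A - \bar{X}_B)'(\hat{\beta}_A - \hat{\beta}_B)
$$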

which gives us three components. The first component is the part of the gap due to group differences in the observable predictors (the “endowments effect”). The second is the part due to differences in the coefficients (the “coefficients effect”). The third is an interaction effect, which accounts for the fact that differences in endowments and in coefficients exist simultaneously between the two groups. The coefficients effect measures the expected change in group B’s mean outcome if group B had group A’s coefficients. Applied to the male-female wage gap, with men as group A and women as group B, it would measure the expected change in women’s mean wage if women’s characteristics were rewarded at the same rates as men’s. The second and third parts of the decomposition are often referred to as the unexplained differences between groups.

Most applications of this method have looked at gender wage gaps, but it has also been used in the labour economics literature to study differences by ethnicity, union membership and immigrant status. It has also been extended to the analysis of gaps in test scores and of differences between schools and countries. The decomposition shares some features with the programme evaluation literature, as it lends itself to a counterfactual interpretation in which the unexplained component plays the role of a “treatment” effect, but it falls short of identifying the mechanisms through which discrimination, or unobserved differences, operate (Fortin, Lemieux and Firpo, 2011[1]; Jann, 2008[2]; Oaxaca, 1973[3]).
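
The three components described above can be computed directly from separate least-squares fits for each group. The following is a minimal sketch in Python with NumPy, not the procedure used in this publication: the function names `ols` and `threefold_decomposition`, the single predictor (years of schooling) and the synthetic wage data are all hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on X (X must already contain a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def threefold_decomposition(X_a, y_a, X_b, y_b):
    """Threefold decomposition of mean(y_a) - mean(y_b), evaluated from group B's viewpoint."""
    beta_a, beta_b = ols(X_a, y_a), ols(X_b, y_b)
    xbar_a, xbar_b = X_a.mean(axis=0), X_b.mean(axis=0)
    return {
        "gap": y_a.mean() - y_b.mean(),
        # Difference in mean predictors, valued at group B's coefficients.
        "endowments": (xbar_a - xbar_b) @ beta_b,
        # Difference in coefficients, valued at group B's mean predictors.
        "coefficients": xbar_b @ (beta_a - beta_b),
        # Simultaneous difference in predictors and coefficients.
        "interaction": (xbar_a - xbar_b) @ (beta_a - beta_b),
    }


# Synthetic, purely illustrative data: log wages as a function of schooling.
rng = np.random.default_rng(0)
n = 500
school_a = rng.normal(13.0, 2.0, n)  # hypothetical group A (e.g. men)
school_b = rng.normal(12.0, 2.0, n)  # hypothetical group B (e.g. women)
X_a = np.column_stack([np.ones(n), school_a])
X_b = np.column_stack([np.ones(n), school_b])
y_a = 1.0 + 0.10 * school_a + rng.normal(0.0, 0.3, n)
y_b = 0.9 + 0.08 * school_b + rng.normal(0.0, 0.3, n)

result = threefold_decomposition(X_a, y_a, X_b, y_b)
for name, value in result.items():
    print(f"{name:>12}: {value:.4f}")
```

Because each regression includes a constant, the fitted mean equals the observed mean in each group, so the three components add up to the raw gap. In this sketch the "unexplained" part discussed above corresponds to the sum of the `coefficients` and `interaction` entries.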


© OECD 2022
