Chapter 4. Establishing a profile of disadvantaged youth (Module 2)

Having defined dimensions of youth well-being and identified indicators, it is then important to be able to identify disadvantaged youth – those at risk for experiencing or those already experiencing well-being deficits. One way to construct a profile of disadvantaged youth is to look at the potential causes and drivers of youth well-being deficits, and then to conduct descriptive and regression analyses to determine whether and to what extent those factors are actually correlated with deficits in youth well-being. This section explains different methods to explore these linkages.


Potential drivers of deficits in youth well-being

Many factors can lead to youth well-being deprivation. Country-level descriptive and regression data analysis can consider three distinct kinds of risk factor: i) individual characteristics and behaviours; ii) family or household characteristics; and iii) community characteristics (see Figure 1.1 and Table 4.6 at the end of the Module for some examples). Other factors, such as policies and macroeconomic conditions, can also have an impact on youth well-being, but assessing their contribution requires other tools. They are thus considered in Module 3.

Individual characteristics and behaviours

The interplay between youth environments and inherent attributes, such as biological conditions (age, gender, ethnicity and race, among others), can be a strong determinant of youth well-being. Other individual characteristics, especially those related to health conditions, such as physical disability, mental health disorders and poor nutrition, are highly detrimental to youth well-being and can seriously affect future outcomes.

Age. Adolescents and young adults face different challenges. Adolescents may engage in a number of risky behaviours, including early sexual initiation, unprotected sex and lack of contraceptive use, which can translate into immediate negative outcomes, such as adolescent pregnancy and sexually transmitted infections (STIs). Adolescence is also an age when individuals are easily influenced by peers and may adopt criminal and violent behaviours that put them at risk for arrest and incarceration. In addition, those who leave school early have lower chances of successfully transitioning to the labour market and accessing decent work. Older youth confront additional challenges, especially drug and alcohol addiction due to substance abuse, as well as idleness (neither studying nor working) and informal employment.

Gender. Girls tend to have less access to education than boys, especially at the secondary and tertiary levels. The situation is more mixed regarding learning achievements: girls generally outperform boys in reading, while the reverse is true in mathematics (United Nations Educational, Scientific and Cultural Organization [UNESCO], 2012). Young women usually face greater difficulties completing the school-to-work transition and securing stable jobs than young men, and they more often end up unemployed or inactive without studying (Elder and Koné, 2014; International Labour Organisation [ILO], 2013a; Rosati, Ranzani and Manacorda, 2015). Globally, young women are less affected by mortality and morbidity than young men but are disadvantaged in many other respects. Maternal conditions, for instance, are one of the leading causes of death among adolescent girls (World Health Organization [WHO], 2014).

Ethnicity. Young people from ethnic minorities are generally less likely to be enrolled in, attend and complete education. They are more likely to drop out of school and tend to have lower academic performance. They usually start the school-to-work transition earlier but have more difficulties completing the transition and securing stable jobs. Ethnic minority youth are, in addition, more exposed to violence and social marginalisation.

Health conditions and disability. Poor health constitutes a major obstacle to youth well-being. Physical disability is not only a negative outcome per se, but an important risk factor for other dimensions of well-being. Physical disability clearly limits youth access to education and employment and pushes many of them towards idleness. Mental health problems are also often associated with school drop-out, grade repetition and poor academic performance, as well as with risky behaviours, such as unprotected sex. Youth with mental health disorders have more difficulties developing their cognitive and non-cognitive skills and are more likely to attempt suicide. Poor nutrition, especially in the early years and in the transition from childhood to adulthood, significantly damages physical and psychological development and brings with it numerous health risks, including underweight, iron-deficiency anaemia, chronic diseases (e.g. diabetes, cancer) and premature death.

Aside from personal attributes, youth well-being is also highly dependent on an individual’s behaviours, in particular those related to sexual and reproductive health (SRH), substance abuse, education and employment, and crime and violence.

SRH. Early sexual activity and unprotected sex can give rise to several negative outcomes, including STIs, unintended pregnancy, early childbearing, unsafe abortion and maternal death. Pregnancy during adolescence carries a number of risks for both the mother and the child. For the mother, health problems that can arise due to early pregnancy range from anaemia, malaria, HIV and other STIs to postpartum haemorrhage and mental disorders, such as depression. For the child, the risks are mainly stillbirth and death, pre-term birth, low birth weight and asphyxia, in particular when pregnant adolescents use tobacco and alcohol. Adolescent mothers are time constrained as a result of their family responsibilities. This is particularly true when they raise their children alone, a situation more common among adolescents than older women. As a consequence, adolescent mothers are likely to drop out of school or have low academic performance and subsequently to become inactive or to enter the labour market in flexible but less rewarding jobs, in the informal sector or as self-employed workers.

Substance abuse. Tobacco, alcohol and illicit drug use undermine both current and future health, as well as other dimensions of youth well-being. Substance users can quickly become addicted and face numerous problems, from cognitive and educational difficulties, including poor academic performance, school absenteeism and early drop-out to low self-esteem and mental disorders, which may lead to suicide attempts. These individuals are often exposed to other risky behaviours and negative outcomes, such as malnutrition, unprotected sex, STIs, and violence and injuries (both as victims and perpetrators). They have, in addition, a lower life expectancy, since they are more predisposed in adulthood to non-communicable diseases and premature death.

Disengagement from school and child labour. Disengagement from school ranges from school absenteeism, poor academic performance and grade repetition to school drop-out. Youth who leave school early have lower human capital endowments and reduced employment opportunities, and may fall into inactivity or get into low-quality and unproductive jobs offering poor earnings. Disengagement can also lead to social marginalisation and exclusion. Out-of-school youth are less connected to society and less civically engaged and are more likely to experience other risky behaviours and negative outcomes, such as delinquency, crime and violence, early sexual initiation, risky sex and adolescent pregnancy, and substance abuse and other unhealthy practices. Some children or youth work while in school or simply drop out to work. Child labour or premature entry into the labour market – in particular, hazardous employment and the worst forms of child labour – can affect health and personal development, as well as future prospects of decent work. Idleness, defined as being neither studying nor working, is also a major concern among youth, potentially resulting in social isolation and triggering other risky behaviours that can further reduce their well-being (European Foundation for the Improvement of Living and Working Conditions [Eurofound], 2012).

Crime and violence. As observed, criminal and violent behaviours are often associated with other risky behaviours and negative outcomes, such as disengagement from school, idleness, substance abuse and unprotected sex. For perpetrators, it can ultimately lead to arrest, incarceration and loss of liberty. Violence and crime can take many forms, including physical fighting, bullying, intimate partner violence, non-fatal assault and homicide. Youth exposed to violence can suffer injuries, disability or even death, in the worst cases. Being the victim of violence constitutes a traumatic experience that can also result in mental disorders, cognitive and educational problems and behavioural changes, such as substance abuse, depression and suicide.

Family characteristics

Beyond personal attributes and behaviours, the family environments in which youth live have an impact on their well-being. Among other factors, household income poverty, low parental education and poor employment situation, lack of parental support, as well as domestic abuse and violence negatively influence youth well-being. Once they have left their parents, for instance to form their own families, youth may be confronted with other family-related issues, such as single parenthood or intimate partner violence.

Poverty. Poverty is a major constraint to youth development and well-being. Youth living in poor households are materially deprived; suffer from poor nutrition and housing conditions, with immediate and future negative consequences to their health; and are socially excluded. They have less access to education and tend to make the transition to the labour market earlier, although they often end up in unemployment or informal employment. Poor youth are more likely than others to be exposed to violence, either as perpetrators or victims, and to adopt risky behaviours, such as disengagement from school, substance abuse, early sexual initiation and unprotected sex. Child labour is also a common phenomenon among poor households due to the need for additional household income and the high opportunity cost of education. Low parental education and an adverse employment situation are usually correlated with household poverty (especially when labour income constitutes the main source of livelihood) and have similar detrimental effects on youth well-being.

Parental behaviours. Youth well-being is strongly influenced by the way parents behave and treat them. For instance, youth who lack parental support and connectedness or who suffer from domestic abuse and violence are more likely to drop out of school, enter the labour market prematurely and develop harmful behaviours, such as early sexual initiation, unprotected sex, substance abuse and violence. They are also more likely to develop negative feelings, as well as mental health and cognitive problems.

Community characteristics

Characteristics of the community where youth live also exert an influence on their behaviours and well-being. These include, in particular, the area of residence (urban or rural), access to and quality of schools, community violence, negative peer influences, availability of adequate infrastructure and public services, as well as social and cultural norms (Table 4.1).

Table 4.1. Examples of inherent and environmental risk factors to youth well-being

Individual characteristics and behaviours

Inherent characteristics

  • Age

  • Gender

  • Ethnicity/race

  • Migration

  • Disability

  • Mental health/low self-esteem

  • Poor nutrition

Risky behaviours

  • SRH: early sexual initiation, risky sex, adolescent pregnancy

  • Substance use: tobacco, alcohol, illicit drugs

  • Disengagement from school: school absenteeism, grade repetition, drop-out, poor academic performance

  • Employment: idleness (NEET), child labour

  • Crime and violence: violent/criminal behaviours, belonging to a violent/criminal group

Family characteristics

  • Poverty/scarcity

  • Poor household amenities

  • Parental education

  • Single parenthood

  • Early parenthood

  • Lack of parental support

  • Lack of parental control

  • Domestic abuse/violence

Community characteristics

  • Location: urban/rural, administrative area (district, province or region)

  • Access/quality of schools

  • Unsafe/poor neighbourhoods

  • Negative peer influences

  • Infrastructure and public services (health, SRH, information and communication technologies)

  • Social and cultural norms (gender, early marriage)

Box 4.1. Main risk factors to be taken into account in bivariate and multivariate analysis

Youth well-being is affected by a multitude of risk factors at micro, meso and macro levels. Not all of them can be taken into account in bivariate and multivariate analysis, either because they require other analytical tools (this is typically the case when a factor is invariant across individuals) or because they are not well measured in the data set used for the analysis.

The literature has nonetheless identified a certain number of risk factors that recurrently affect multiple dimensions of youth well-being and are more commonly available in existing datasets. Depending on their availability, the analysis of youth well-being should be conducted taking into account at least this minimal list of risk factors:

  • Individual characteristics: age, gender, ethnicity or race, level of education, health status or disability, marital status

  • Household characteristics: poverty/scarcity, parental education, family structure (living with parents or not, presence of children)

  • Community characteristics: urban (ideally capital city and other urban) and rural area of residence, region of residence.

Categories of youth most affected by well-being deficits

Descriptive statistics are the initial step of any data analysis. They can offer a first insight into who the disadvantaged youth are, based on the above individual, household and community characteristics. With descriptive statistics, it is possible to perform bivariate analysis, which describes the relationship between pairs of variables, such as dependent variables (well-being deficits) and independent variables (risk factors). Bivariate descriptive analysis does not permit causal inference; rather, it shows if and to what extent two variables are interrelated. Descriptive statistics in the case of bivariate analysis mainly include cross-tabulations, graphical representations and quantitative measures of dependence, such as correlations.


Cross-tabulations provide a basic picture of the relation between risk factors and well-being deficits. They display the outcome of the dependent variable (well-being deficit) disaggregated for different subgroups as defined by the independent variable (risk factor). Usually, the dependent variable is displayed in columns and the risk factors in rows. From cross-tabulations, the most at-risk youth can be approximated by identifying the subgroups with the highest rates of well-being deficits.
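As a minimal sketch of a cross-tabulation, the Python snippet below computes the share of youth experiencing a deficit within each subgroup of a risk factor. The records, the variable names (`area`, `neet`) and the function `deficit_rate_by_group` are invented for illustration and do not come from any survey.

```python
from collections import defaultdict


def deficit_rate_by_group(records, risk_factor, deficit):
    """Share of youth with the deficit (coded 1) in each subgroup of the risk factor."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [youth with deficit, all youth]
    for row in records:
        counts[row[risk_factor]][0] += row[deficit]
        counts[row[risk_factor]][1] += 1
    return {group: with_deficit / total
            for group, (with_deficit, total) in counts.items()}


# Hypothetical micro-data: one dictionary per surveyed young person.
records = [
    {"area": "rural", "neet": 1}, {"area": "rural", "neet": 1},
    {"area": "rural", "neet": 0}, {"area": "rural", "neet": 1},
    {"area": "urban", "neet": 0}, {"area": "urban", "neet": 1},
    {"area": "urban", "neet": 0}, {"area": "urban", "neet": 0},
]

rates = deficit_rate_by_group(records, "area", "neet")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} NEET")
```

With real survey data, sampling weights would normally be applied when computing these shares; the sketch leaves them out for brevity.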

Graphical representations

Two-way graphs are often preferred to, or analysed in conjunction with, cross-tabulations because they offer a better visualisation of the relationship between two variables. Different types of graphs exist. Bar charts are used when the independent variable (risk factor) is a categorical variable, which is the case for most of the risk factors identified in this methodology. The magnitude of the well-being deficit is represented by a bar for each subgroup considered. The most at-risk youth are therefore easily identifiable.


Tables and graphs provide a good illustration of youth most affected by well-being deficits. However, these descriptive statistics should be complemented by more robust measures that quantify and test the level of association between risk factors and well-being deficits.

Two measures of association are commonly used in the literature: Spearman’s rank and Pearson’s correlation coefficients. These coefficients are indexes of the strength of association between two variables and range from -1 to 1. The sign of the coefficient indicates the direction of the correlation: a positive sign means that the two variables are positively correlated; a negative sign means that they are negatively correlated. The absolute value of the coefficient indicates, in turn, the magnitude or strength of the correlation, from 0 (no association) to 1 (perfect association). Once the correlation coefficient is obtained, it is recommended to test the statistical significance of the correlation. This can be done by using, for instance, a t-test on the coefficient or, when both variables are categorical, the chi-square test of independence. If the correlation coefficient is statistically significant at a reasonable confidence level (95% or 99%), it is possible to conclude that the two variables are correlated.
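The two coefficients can be sketched in a few lines of plain Python. The data below (years of schooling and months spent unemployed) are invented, and the helper names are illustrative. The significance check uses the t-statistic r·sqrt((n − 2)/(1 − r²)), which is one common choice for a Pearson coefficient and is shown here only as an example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def t_statistic(r, n):
    """t-statistic for H0: no correlation (n - 2 degrees of freedom, |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical data for eight young people.
years_of_schooling = [4, 6, 8, 9, 10, 12, 12, 14]
months_unemployed = [14, 10, 12, 7, 8, 3, 5, 2]

r = pearson(years_of_schooling, months_unemployed)
rho = spearman(years_of_schooling, months_unemployed)
t = t_statistic(r, len(years_of_schooling))
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}, t = {t:.2f}")
```

A negative coefficient here would indicate that more schooling goes together with fewer months of unemployment in this invented sample; it says nothing about causation.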

These measures of association can be used to identify which of the risk factors are significantly positively correlated with well-being deficits and, in particular, which of the risk factors exhibit the strongest correlations. These results give a first insight into which youth are most at risk for experiencing well-being deficits.

Determinants of youth well-being deficits

While bivariate descriptive tools can give a rough approximation of the correlation between risk factors and well-being deficits, they need to be complemented with multivariate regression analysis. Multivariate analysis is commonly used to identify the determinants of youth disadvantage, as it provides more accurate measures of the correlation between risk factors and well-being deficits by taking into account multiple factors simultaneously and isolating the effect of each of them while controlling for the others.

The multivariate regression analysis can be conducted following five sequential steps:

  1. Select the data.

  2. Define the dependent variable (well-being deficit indicator).

  3. Determine the most appropriate model to estimate the dependent variable.

  4. Specify the independent variables (risk factors) to be included in the model.

  5. Identify the determinants of the well-being deficit by testing the statistical significance of the risk factors and quantifying their effects on the well-being deficit.

The rest of this section describes these steps in more detail and then provides a few examples of simple regression analysis that can be conducted using a variety of available data sets to identify the determinants of youth well-being deficits in the areas of i) employment; ii) education and skills; iii) health; and iv) participation and empowerment.

While the examples have the advantage of relying on a rather simple methodology, they have some limitations. First, results obtained from regression analysis are highly dependent on both the type of model used and how it is specified (the explanatory variables selected). In other words, regression analysis cannot establish with certainty if and to what extent the risk factors considered really affect well-being; rather, it gives an indication of the potential determinants and their approximate effects. Second, simple regression models, such as those proposed below, do not permit establishing causal relationships between risk factors and well-being deficits. This can be done using more complex econometric techniques, such as treatment effects models. Third, with simple regression models, it is also not possible to deal with potential issues that can bias the results, such as endogeneity (when explanatory variables are correlated with the error term) and multi-collinearity (when two or more explanatory variables are highly correlated). These potential problems can be overcome by using more sophisticated methods, including the instrumental variable approach, that are beyond the scope of this toolkit.

Multivariate regression analysis

Step 1. Select the data

Employment. Information on employment can be found in labour force surveys or in other household surveys with an employment module. When analysing the school-to-work transition, panel data should be preferred, since they follow individuals over multiple time periods. In the absence of panel data, cross-section data with retrospective information on labour market history could be used, such as the ILO school-to-work transition surveys (SWTSs), which have been widely applied to analyse this phenomenon.

Education and skills. Population censuses and household surveys, including labour force surveys, are generally used to analyse education and skills. Learning achievements and cognitive skills in particular can be analysed through a number of international surveys, including the Organisation for Economic Co-operation and Development (OECD) Programme for International Student Assessment (PISA) and the Young Lives longitudinal study of child poverty.

Health. Health and SRH in particular are best analysed using population censuses and household surveys, such as the demographic and health surveys (DHS) and the Centers for Disease Control and Prevention (CDC)-assisted reproductive health surveys. Information on substance use can be found in specialised surveys, including the following two WHO-supported school-based surveys: the Global school-based student health surveys (GSHS) and the Health Behaviour in School-aged Children surveys (HBSC).

Box 4.2. Which type of data to use?

Household surveys constitute the main and richest source of information to measure deficits in youth well-being and analyse their determinants at the individual, family and community levels. Data from household surveys can broadly be classified into three different types: panel data, cross-section data with retrospective information, and cross-section data without retrospective information.

  • Panel data. Also known as longitudinal data, panel data are the best data to use in that they offer the widest possibilities in terms of analysis. In panel surveys, the same individuals are followed over multiple time periods, making possible a dynamic analysis of well-being over the youth life phase. In particular, panel data allow investigating the impacts of prior deficits and risk factors on youth well-being and their evolution over time. Panel surveys are more costly and difficult to implement than other surveys. For these reasons, they are usually scarce in developing countries.

  • Cross-section data with retrospective information. In the absence of panel data, a quasi-dynamic analysis is still possible with cross-section data containing retrospective information. Cross-section surveys differ from panel surveys in that individuals are only observed at a specific point in time. If multiple cross-sections exist, the investigation can be conducted for different points in time, but the individuals analysed will not be the same. A good example is the SWTS, which contains comprehensive information on the past situation and characteristics of surveyed individuals, with a focus on youth labour market history.

  • Cross-section data without retrospective information. Usually, cross-section surveys focus on the current situation of individuals and include very limited information on prior situation and characteristics. Conducting rigorous and detailed investigation on youth well-being is still possible with such surveys, but the analysis will be static and will not follow a life cycle approach.

Participation and empowerment. Data on social capital and civic and political engagement can be obtained from international surveys, such as the Gallup World Poll surveys and the World Values Surveys. Similar opinion polls conducted at the regional level, including Afrobarometer, Latinobarometer and Asianbarometer, constitute rich sources of information and hence must be exploited to document issues related to youth participation and empowerment. With regard to youth violence, cross-national information is difficult to retrieve, as it is for other risky behaviours, since it is only available in a few specialised international surveys covering a limited number of countries, such as the GSHS and HBSC surveys.

Step 2. Define the dependent variable

The dependent variables used are qualitative variables that take two or more values. In the case of a binary qualitative variable, the dependent variable takes value 1 for youth experiencing the well-being deficit and 0 for youth not affected by the well-being deficit. This coding is used to distinguish the outcome of interest (experiencing the well-being deficit), which is coded 1, from the reference or base outcome (not being affected by the well-being deficit), which is coded 0.

In the case of a qualitative variable with multiple categories, the dependent variable takes several values corresponding to different well-being outcomes. The dependent variable is not ordinal because well-being outcomes cannot be ranked from worst to best in an unquestionable way. However, some well-being outcomes are clearly more favourable than others. The outcome assumed to be the most favourable is considered the reference category and coded 0 accordingly. The way the other outcomes are ordered does not matter, since they are all analysed separately in comparison to the reference category.

Step 3. Select an appropriate estimation model

In the case of a binary qualitative dependent variable, models that can be used include linear and non-linear probability models, such as the logit and probit regression. These models allow deriving the partial effect of the independent variables – i.e. the effect of each independent variable, holding all the others fixed – on the probability that the outcome of interest has occurred.

Of the two, non-linear probability models are more popular in the empirical literature. The linear probability model, which is estimated using ordinary least squares regression, is less suited to a binary dependent variable because its predicted values can fall outside the unit interval (be negative or exceed 1) for some values of the explanatory variables. This is actually not very problematic if partial effects are estimated at the mean of the independent variables or if most independent variables are discrete and contain only a few values, such as dummies (Wooldridge, 2002).

Logit and probit models are basically the same: the probability of the outcome of interest is modelled using a cumulative distribution function, and the coefficients are estimated by maximum likelihood. They only differ in that the error term does not have the same distribution (standard logistic in the logit model versus standard normal in the probit model). Both models can be used equally to analyse the determinants of a particular well-being deficit. When the dependent variable is qualitative and has more than two categories, multinomial non-linear probability models, such as the multinomial logit regression, are the most appropriate.
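In standard notation (the symbols are introduced here for illustration and are not used elsewhere in this toolkit), let y_i be the binary deficit indicator for individual i, x_i the vector of risk factors and β the vector of coefficients. The two models and the log-likelihood that is maximised can then be written as:

```latex
\begin{aligned}
\text{Logit:} \quad & P(y_i = 1 \mid x_i) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \\
\text{Probit:} \quad & P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \int_{-\infty}^{x_i'\beta} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right) dz \\
\text{Log-likelihood:} \quad & \ell(\beta) = \sum_{i=1}^{n} \Big[ y_i \ln F(x_i'\beta) + (1 - y_i) \ln\big(1 - F(x_i'\beta)\big) \Big], \quad F \in \{\Lambda, \Phi\}
\end{aligned}
```

Here Λ is the standard logistic and Φ the standard normal cumulative distribution function, which is the only difference between the two specifications.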

Step 4. Specify the independent variables of the model

Independent variables can be either continuous or categorical. The effect of a continuous independent variable on the dependent variable is not necessarily linear. This is the case, for example, for the variable ‘age’. When the relationship between the two variables is suspected to be non-linear, it is preferable to add the square of the independent variable to the model to show whether its effect decreases, remains constant or increases as the variable reaches higher values. The squared variable should be divided by a scaling factor (for example, 100 in the case of age) so as to offset any effect of scale.

For interpretational purposes, categorical variables, such as sex and levels of education, must be introduced as dummies in the model (dichotomous variables coded 0/1). When the categorical variable has more than two categories, as is the case for levels of education, one of the categories must be left out to serve as the reference category to which the other categories are compared.
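The coding rules above can be sketched as follows. The variable names, the list of education levels and the choice of "none" as the reference category are assumptions made for this example only.

```python
# Hypothetical education categories; "none" is left out as the reference category.
EDUCATION_LEVELS = ["none", "primary", "secondary", "tertiary"]
REFERENCE_LEVEL = "none"


def build_regressors(person):
    """Turn one survey record into the regressors of the model."""
    row = {
        "age": person["age"],
        "age_sq_100": person["age"] ** 2 / 100,  # squared term, rescaled by 100
        "female": 1 if person["sex"] == "female" else 0,
    }
    # One dummy per education level, except the reference category.
    for level in EDUCATION_LEVELS:
        if level != REFERENCE_LEVEL:
            row[f"edu_{level}"] = 1 if person["education"] == level else 0
    return row


example = build_regressors({"age": 20, "sex": "female", "education": "secondary"})
baseline = build_regressors({"age": 18, "sex": "male", "education": "none"})
print(example)
print(baseline)
```

A person in the reference category has all education dummies equal to 0, so each education coefficient is read as a difference relative to that category.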

Step 5. Identify the determinants of the well-being deficit

Run the model and display the results. To identify the determinants of the well-being deficit, test the statistical significance of the coefficients of the independent variables and look at their sign. Wald tests can be performed to determine if each coefficient is statistically different from zero. If the results of the Wald tests show that coefficients are statistically significant at a reasonable confidence level (95% or 99%), then we can assume that the corresponding independent variables have an effect on the well-being deficit.

Focus only on independent variables with statistically significant coefficients and look at their sign. If the sign is positive, it means that the independent variable increases the likelihood of experiencing the well-being deficit. In that case, the independent variable can be considered a risk factor. Conversely, if the sign is negative, the likelihood is reduced, and the independent variable is assumed to be a protective factor. From the above, disadvantaged youth can be defined as all youth exhibiting risk factors, i.e. with characteristics that make them more likely to experience the well-being deficit.
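The sketch below illustrates this step on simulated data. It is not the toolkit's own procedure: the logit is estimated with a hand-written Newton-Raphson routine using NumPy (a statistical package would normally be used instead), and the data-generating values, variable names and the function `fit_logit` are all assumptions made for the example. The Wald statistic is computed as the coefficient divided by its standard error and compared with 1.96 (95% level).

```python
import numpy as np


def fit_logit(X, y, iterations=25):
    """Maximum-likelihood logit by Newton-Raphson; returns coefficients and standard errors."""
    beta = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (y - p)                               # gradient of the log-likelihood
        information = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian
        beta = beta + np.linalg.solve(information, score)
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    information = X.T @ (X * (p * (1.0 - p))[:, None])
    std_errors = np.sqrt(np.diag(np.linalg.inv(information)))
    return beta, std_errors


# Simulated (not real) data: being female raises and age lowers the deficit probability.
rng = np.random.default_rng(0)
n = 2000
female = rng.integers(0, 2, n)
age = rng.integers(15, 30, n)
true_index = 1.0 + 0.8 * female - 0.1 * age
deficit = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_index))).astype(float)

X = np.column_stack([np.ones(n), female, age])
names = ["constant", "female", "age"]
beta, se = fit_logit(X, deficit)
wald_z = beta / se

# Classify each regressor (excluding the constant) by significance and sign.
for name, b, z in list(zip(names, beta, wald_z))[1:]:
    if abs(z) < 1.96:
        label = "not significant at the 95% level"
    elif b > 0:
        label = "risk factor"
    else:
        label = "protective factor"
    print(f"{name}: coefficient = {b:.3f}, Wald z = {z:.2f} -> {label}")
```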

In logit and probit models, as well as in other non-linear probability models, such as the multinomial logit, the value of the coefficients cannot be read directly as a change in probability. It is therefore necessary to compute partial or marginal effects to quantify the effects of risk factors on the well-being deficit. The marginal effect is computed separately for each independent variable and corresponds to the change in the probability of experiencing the well-being deficit induced by a marginal change in the independent variable.

The interpretation of the marginal effect differs depending on the nature of the independent variable. If the independent variable is a dummy, like gender (takes value 1 for female and 0 for male), then the marginal effect corresponds to the change in probability when the independent variable changes from 0 to 1. For example, in the case of gender, the marginal effect is the difference in the probability of experiencing the well-being deficit when the young individual is female as compared to male. If the independent variable is instead discrete with more than two values, like age, the marginal effect is then interpreted as the change in probability following a one-unit change in the independent variable, which, in the case of age, corresponds to the situation of being one year older. Lastly, when the independent variable is continuous (takes any value within a specified range), the marginal effect is defined as the change in probability of experiencing the well-being deficit induced by an infinitesimal change of the independent variable (1% change in practice).
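The three interpretations can be illustrated with a small logit example. The coefficients below are hypothetical values chosen for the sketch, not estimates from any data set, and the effects are evaluated for one reference profile (age 20); in applied work they are often averaged over the sample or evaluated at the means.

```python
import math

# Hypothetical logit coefficients, chosen for illustration only.
COEFFICIENTS = {"constant": 1.0, "female": 0.8, "age": -0.1}


def probability(female, age):
    """Predicted probability of the deficit under the logit model."""
    index = (COEFFICIENTS["constant"]
             + COEFFICIENTS["female"] * female
             + COEFFICIENTS["age"] * age)
    return 1.0 / (1.0 + math.exp(-index))


# Dummy variable: change in probability when 'female' goes from 0 to 1 (at age 20).
effect_female = probability(1, 20) - probability(0, 20)

# Discrete variable: change in probability for being one year older (female, 20 to 21).
effect_one_year = probability(1, 21) - probability(1, 20)

# Continuous treatment: derivative of the probability with respect to age,
# which for the logit equals coefficient * p * (1 - p).
p = probability(1, 20)
effect_age_derivative = COEFFICIENTS["age"] * p * (1.0 - p)

print(f"female vs male:      {effect_female:+.4f}")
print(f"one year older:      {effect_one_year:+.4f}")
print(f"derivative (at 20):  {effect_age_derivative:+.4f}")
```

The one-year difference and the derivative are close but not identical, because the logit probability is a curved function of age.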

Determinants of youth well-being deficits in employment

The analysis of the determinants of youth employment outcomes is illustrated in Table 4.2 through two examples: i) the transition from school to work; and ii) informality. Two different models are considered to identify the determinants of the school-to-work transition: one that analyses the factors behind incomplete school-to-work transition and another that looks at the different trajectories that youth can follow in the transition from school to work, in particular the situation of youth who become NEET. A third model is then described to analyse the determinants of informal employment among youth.

Table 4.2. Determinants of youth disadvantage: employment

Well-being deficits

  • Model 1: Incomplete school-to-work transition

  • Model 2: Selection into different school-to-work transition pathways

  • Model 3: Informal employment

Type of data needed

  • Model 1: SWTSs; labour force surveys or other household surveys with an employment module that includes information on job satisfaction

  • Model 2: Panel data containing information on education and employment

  • Model 3: Labour force surveys; SWTSs or other household surveys with an employment module that includes sufficient information to characterise formal and informal employment

Estimation model

  • Model 1: Logit or probit model

  • Model 2: Multinomial logit model

  • Model 3: Logit or probit model

Dependent variable

  • Model 1: Dummy variable taking value 1 for youth still in transition and 0 for youth having completed the transition

  • Model 2: Categorical variable taking value 0 for early workers, 1 for students, 2 for working students, 3 for late workers, 4 for job-seekers and 5 for labour force drop-outs

  • Model 3: Dummy variable taking value 1 for informally employed youth and 0 for formally employed youth

Independent variables

  • Model 1. Individual characteristics: age, gender, education, migration, disability, employment history. Family characteristics: income poverty, low parental education, household head, family composition (presence of children). Community characteristics: location, urban/rural area of residence.

  • Model 2. Individual characteristics: age, gender, ethnicity/race, low education level, poor health status or disability, immigration background, marital status. Family characteristics: income poverty; household composition (living with parents, presence of children, other youth or adults other than parents); difficult family environment (e.g. low parental education, bad employment situation, single parenthood). Community characteristics: location, urban/rural area of residence.

  • Model 3. Individual characteristics: age (and its square, to take into account possible non-linear returns), gender, level of education, disability or health issues, marital status, migration status. Family characteristics: income poverty, parental education. Community characteristics: location, urban/rural area of residence.

Model 1: Determinants of incomplete school-to-work transition

In this model, the ILO definition of school-to-work transition stages can be used to differentiate between youth in transition and youth who completed the transition (see, for instance, ILO, 2013a; Elder and Koné, 2014).

Youth in transition are in one of the following situations:

  • unemployed, according to the relaxed definition (not employed and available but not necessarily looking for work)

  • employed in a temporary and non-satisfactory job

  • employed in non-satisfactory self-employment

  • inactive non-students with future plans to work

Youth having completed the transition are in one of the following situations:

  • employed in a stable job, whether satisfactory or non-satisfactory

  • employed in a satisfactory but temporary job

  • employed in satisfactory self-employment.

Stable employment is defined as including all employees with a contract of employment, either written or oral, of more than 12 months’ duration. Employees with a contract of no more than a year are considered in temporary employment. Satisfactory and non-satisfactory employment are defined based on a subjective question asking workers whether they are satisfied (either very satisfied or somewhat satisfied, in the SWTSs) or not with their current jobs.
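The classification rules above can be sketched as a small function that builds the dependent variable of Model 1 (1 for youth in transition, 0 for youth having completed the transition). The activity codes and argument names are hypothetical labels, not SWTS variable names, and returning `None` for youth who fall into neither group (e.g. students) is an assumption of this sketch.

```python
from typing import Optional

def is_stable(contract_months: Optional[float]) -> bool:
    """Stable job: contract (written or oral) of more than 12 months."""
    return contract_months is not None and contract_months > 12

def transition_status(activity: str,
                      contract_months: Optional[float] = None,
                      satisfied: Optional[bool] = None,
                      plans_to_work: bool = False) -> Optional[int]:
    """Return 1 (in transition), 0 (transition completed) or None (neither).

    `activity` uses hypothetical codes: "employee", "self_employed",
    "unemployed_relaxed", "inactive_non_student", "student".
    """
    if activity == "employee":
        if is_stable(contract_months):
            return 0                      # stable job, satisfactory or not
        return 0 if satisfied else 1      # temporary job: depends on satisfaction
    if activity == "self_employed":
        return 0 if satisfied else 1
    if activity == "unemployed_relaxed":
        return 1                          # not employed and available for work
    if activity == "inactive_non_student":
        return 1 if plans_to_work else None
    return None                           # assumption: outside both groups

print(transition_status("employee", contract_months=6, satisfied=False))
```

A contract of exactly 12 months is treated as temporary here, following the "no more than a year" wording in the text.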

Individual, family and community characteristics, such as those reported in Table 4.2, have an impact on school-to-work completion. At the individual level, the model should at least include independent variables for age, gender and level of education attained or completed. Migratory and disability statuses should also be considered, among other relevant individual characteristics. If using SWTSs, some retrospective information on individuals’ labour market history should be added, such as the number of past spells of unemployment, temporary employment or self-employment (Shehu and Nilsson, 2014) or whether youth were working while in school (Rosati, Ranzani and Manacorda, 2015).

At the family level, poverty or scarcity needs to be taken into account whenever possible. This can be done by introducing a dummy variable for poverty status or, alternatively, by introducing several dummies identifying the different quantiles of the household income or consumption expenditure distribution in which youth are located. Parental education can be used as a proxy in the absence of information on household income or consumption expenditure. Other relevant family characteristics include household head status and the presence of children.
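As an illustration of the second option, the sketch below assigns each young person to a quintile of household income and builds the corresponding dummies. The function names are hypothetical, the richest quintile is arbitrarily chosen as the omitted reference group, and survey weights and ties are ignored for simplicity.

```python
def quantile_group(values, n_groups=5):
    """Assign each value to a quantile group 1..n_groups (1 = poorest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(values) + 1
    return groups

def quantile_dummies(groups, n_groups=5, reference=5):
    """One 0/1 dummy per group, omitting the reference group to avoid collinearity."""
    kept = [g for g in range(1, n_groups + 1) if g != reference]
    return [{f"q{g}": int(grp == g) for g in kept} for grp in groups]

# Hypothetical household incomes for ten young people.
incomes = [10, 50, 20, 40, 30, 60, 70, 80, 90, 100]
groups = quantile_group(incomes)
dummies = quantile_dummies(groups)
print(groups)
print(dummies[0])
```

With the richest quintile omitted, each estimated coefficient would be read as the difference relative to youth living in the richest households.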

At the community level, location variables are important, given the large spatial disparities commonly documented in the literature. Dummies for urban/rural area of residence and regions are recommended in this regard.

Note that the literature proposes other interesting analyses and estimation methods relative to the school-to-work transition that will not be covered here. Rosati, Ranzani and Manacorda analyse simultaneously the probability that youth never transit to employment, either to a first job or to stable employment, and the duration of the transition among those who are expected to transition, using a split population model (2015). Elder and Koné propose, in turn, three different models estimated using the logit regression to analyse the probability of making the transition, transitioning into stable employment versus satisfactory temporary work or self-employment, and making a short transition (2014). Similarly, Shehu and Nilsson estimate three different models using the probit regression to analyse the probability of making the transition, transitioning into satisfactory temporary work or self-employment, and transitioning into stable employment (2014).

These last two sets of models, whether estimated with a logit or a probit regression, are good alternatives to the one suggested here, because not all jobs held by youth who completed the school-to-work transition are of equal quality. It could be interesting, then, to run two additional models that distinguish jobs of higher quality from those of lower quality, such as stable employment versus satisfactory temporary work or self-employment. Analysing the length of the school-to-work transition provides further useful information on this phenomenon but has the major drawback of excluding youth who never went to school. In some developing countries, these youth account for a significant share of the total young population.

Model 2: Determinants of selection into different school-to-work transition pathways

This model requires panel data with successive samples or rotation groups in which individuals are observed on a monthly basis for a long period (e.g. 48 months) (OECD, 2013). In the absence of panel data, cross-section data with retrospective information on labour market history (e.g. from SWTSs) could be used.

The sample must be restricted to individuals defined as youth and enrolled in education at the beginning of the observation period. In each month of the observation period, these individuals must be classified into one of the following mutually exclusive labour market states: i) in employment; ii) in education; iii) in employment and education; iv) unemployed; and v) inactive NEET.

The dependent variable is a categorical variable which has the following values:

  • 0 – for early workers: find employment relatively soon

  • 1 – for students: remain in education during the observation period

  • 2 – for working students: combine school and work

  • 3 – for late workers: find employment towards the end of the observation period

  • 4 – for job-seekers: enter unemployment

  • 5 – for labour force drop-outs: become inactive NEETs

Each of these categories corresponds to a different school-to-work pathway. The most successful pathway is the one followed by early workers who find employment relatively soon after leaving the educational system. This category is therefore considered the reference category and coded 0 accordingly.

The dependent variable is not ordinal because the different pathways cannot be ranked from worst to best in an unquestionable way. However, some pathways are clearly less favourable than others. This is undoubtedly the case for youth who do not access employment after leaving school and become unemployed (job-seekers) or, worse, inactive (labour force drop-outs). The primary focus of this model will be on these two last categories.
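To make the role of the reference category concrete, the following sketch computes the pathway probabilities implied by a multinomial logit, with the coefficients of category 0 (early workers) normalised to zero. The covariates and coefficient values are hypothetical and serve only to show the mechanics.

```python
import math

PATHWAYS = ["early worker", "student", "working student",
            "late worker", "job-seeker", "labour force drop-out"]

def pathway_probabilities(x, betas):
    """Multinomial logit probabilities for categories 0..5.

    x     : covariate values, starting with 1.0 for the constant
    betas : one coefficient list per non-reference category (1..5);
            the reference category 0 has all coefficients equal to zero
    """
    scores = [0.0] + [sum(b * v for b, v in zip(beta_k, x)) for beta_k in betas]
    top = max(scores)                       # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical coefficients on (constant, female, low_education), categories 1..5.
BETAS = [
    [0.2, 0.1, -0.5],
    [-0.3, 0.0, -0.4],
    [-0.5, 0.1, 0.3],
    [-1.0, 0.2, 0.8],
    [-1.5, 0.9, 1.0],
]
probs = pathway_probabilities([1.0, 1.0, 1.0], BETAS)
for name, p in zip(PATHWAYS, probs):
    print(f"{name}: {p:.3f}")
```

Each coefficient is interpreted relative to early workers: the ratio of the probability of category k to that of category 0 equals the exponential of the linear index of category k.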

These pathways are defined after conducting a cluster analysis on individual trajectories based on the following two elements: i) their labour market state at the end of the trajectory; and ii) their transition paths (the relative frequency of each labour market state and the number of transitions between different states over the observation period). The cluster analysis is performed using Ward’s hierarchical agglomerative clustering algorithm (Ward, 1963).
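Before clustering, each monthly trajectory has to be summarised by the elements listed above. The sketch below builds such a feature vector; the state codes and the layout of the vector are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical codes for the five monthly labour market states, in the order
# they are listed in the text: employment, education, employment and education,
# the fourth (non-employed, job-seeking) state, inactive NEET.
STATES = ["E", "S", "ES", "U", "I"]

def trajectory_features(monthly_states):
    """Summarise one trajectory for the cluster analysis.

    Returns: share of months spent in each state, number of transitions
    between different states, and 0/1 indicators for the final state.
    """
    n = len(monthly_states)
    counts = Counter(monthly_states)
    shares = [counts[s] / n for s in STATES]
    transitions = sum(1 for a, b in zip(monthly_states, monthly_states[1:]) if a != b)
    final = [int(monthly_states[-1] == s) for s in STATES]
    return shares + [transitions] + final

# Hypothetical 12-month trajectory: school, then job search, then work.
trajectory = ["S"] * 6 + ["U"] * 2 + ["E"] * 4
features = trajectory_features(trajectory)
print(features)
```

The resulting vectors, one per individual, could then be passed to an implementation of Ward's method, for instance `scipy.cluster.hierarchy.linkage` with `method="ward"`. Whether and how the features should be rescaled before clustering is left open here.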

Some of the factors reported in Table 4.2 could be taken into account, since they are very likely to affect the probability of following a negative school-to-work pathway and in particular becoming unemployed or inactive non-students (OECD, 2013; Eurofound, 2012; Bills et al., 2014). As the analysis is conducted using panel data, it is recommended to add dummies to control for year or month effects. If the data allow it, some macro factors disaggregated at the local level could be added, such as the fertility rate and the youth unemployment rate (OECD, 2013).

Model 3: Determinants of informal employment

Youth are considered to be working informally if they belong to one of the following categories:

  • Formal sector enterprises: contributing family workers and informal employees

  • Informal sector enterprises: own-account workers, employers, contributing family workers, informal employees and members of producer co-operatives

  • Households: own-account workers and informal employees.

The informal sector is composed of unregistered businesses with a small number of workers (in practice, the threshold is usually set at five or ten), and informal employees are defined as those without access to at least one of the key employment benefits, such as annual paid leave, paid sick leave or social security contributions.
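The definition above can be sketched as a function returning the dependent variable of Model 3. The codes for production units, employment statuses and benefits are hypothetical labels, and the sketch assumes that the type of production unit (formal sector, informal sector or household) has already been determined.

```python
KEY_BENEFITS = ("annual_paid_leave", "paid_sick_leave", "social_security")

# Statuses counted as informal in each type of production unit, other than
# employees, who are classified according to their access to benefits.
INFORMAL_STATUSES = {
    "formal_sector": {"contributing_family_worker"},
    "informal_sector": {"own_account", "employer",
                        "contributing_family_worker", "producer_coop_member"},
    "household": {"own_account"},
}

def is_informal_employee(benefits: dict) -> bool:
    """Informal employee: lacks at least one of the key employment benefits."""
    return not all(benefits.get(b, False) for b in KEY_BENEFITS)

def is_informal(unit: str, status: str, benefits: dict = None) -> bool:
    """Return True if the worker is informally employed."""
    if status == "employee":
        return is_informal_employee(benefits or {})
    return status in INFORMAL_STATUSES[unit]

full = {"annual_paid_leave": True, "paid_sick_leave": True, "social_security": True}
print(is_informal("formal_sector", "employee", full))
print(is_informal("household", "own_account"))
```

An employee with no information on benefits is treated as informal in this sketch; in practice, missing answers would need to be handled explicitly.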

Factors displayed in Table 4.2 are believed to play a significant role in the probability of falling into informal employment (see, for instance, Shehu and Nilsson, 2014). If using SWTSs or similar surveys, information on labour market history could be added to the model, such as the number and length of past employment spells, as well as the length of the transition from school to the labour market (ibid.).

Determinants of youth well-being deficits in education and skills

Table 4.3 below presents two models to analyse specific deficits in the area of education and skills. The first model illustrates how to analyse the risk factors leading to poor learning achievements.1 Poor learning achievements are measured by negative results in cognitive skills assessments. Three types of cognitive skills are considered: i) literacy (reading and writing); ii) numeracy; and iii) general cognitive skills based on a receptive vocabulary test. The second model focuses on the determinants of school drop-out.

Table 4.3. Determinants of youth disadvantage: education and skills

| Well-being deficits | Poor learning achievements | School drop-out |
| --- | --- | --- |
| Type of data needed | Surveys that include information on youth learning achievements and cognitive skills, such as PISA, Young Lives longitudinal study of child poverty, Trends in International Mathematics and Science Study (TIMSS), Program on the Analysis of Education Systems (PASEC), Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) and Latin American Laboratory for Assessment of the Quality of Education (LLECE) | Household surveys, labour force surveys; information on school drop-out can also be found in national population censuses. |
| Estimation model | Logit or probit model | Logit or probit model |
| Dependent variable | Dummy variable taking value 1 for low performers in cognitive skills assessments and 0 for successful performers. Three types of cognitive skills considered: i) literacy (reading and writing), ii) numeracy, iii) general cognitive skills based on a receptive vocabulary test | Dummy variable taking value 1 for youth who dropped out of school and 0 for youth who are attending school |
| Independent variables: individual characteristics | age, gender, ethnicity or mother tongue, malnutrition, education (school attendance, as well as years or level of education); poor non-cognitive skills, such as low self-esteem; prior cognitive skills (prior learning achievements in literacy, numeracy and general cognitive skills based on tests scores or results) | age; gender; ethnicity/race; disability and mental health; education (grade repetition, low academic performance); child labour; substance abuse (tobacco, alcohol, illicit drugs); early parenthood |
| Independent variables: family characteristics | income poverty, housing quality, access to services, consumer durables, parental/caregiver education or literacy | income poverty, parental education and employment situation, lack of parental support, family structure (presence of other children) |
| Independent variables: community characteristics | location, urban/rural area of residence, access to/quality of schools, social norms | location, urban/rural area of residence, access to/quality of schools, social norms |

Model 1: Determinants of poor learning achievements

For this model, data from the Young Lives longitudinal study of child poverty can be used. This study covers four countries, namely Ethiopia, India, Peru and Viet Nam. Two age cohorts of children, born in 1994-95 and 2000-01 respectively, are traced in three successive rounds conducted in 2002, 2006-07 and 2009.2

The sample is restricted to the first cohort of the Young Lives study. Children are thus observed at ages 7/8 in 2002, 11/12 in 2006-07 and 14/15 in 2009. Poor learning achievements in literacy, numeracy and general cognitive skills are analysed separately for each of these ages whenever possible. The analysis is performed at different ages in order to determine how differently risk factors – in particular, past conditions and outcomes, such as prior cognitive and non-cognitive skills – can influence poor learning achievements over the childhood phase.


Literacy.

Poor learning achievement in literacy refers to children who cannot read or write, or who can read or write but with difficulties (read only letters or words, or write with difficulties or errors). Information on literacy is only available in the first two survey rounds of the Young Lives study. Accordingly, poor learning achievements in literacy can only be analysed at ages 7/8 and 11/12.


Numeracy.

As regards numeracy, surveyed children are asked to perform a test whose level of difficulty varies by age. At ages 7/8, the test consists of solving a simple arithmetic calculation. Older children have to perform more complex mathematics tests, consisting of 10 questions at ages 11/12 and 30 questions at ages 14/15. Scores are derived based on the results. Those with scores below the average are considered low performers. Poor learning achievement in numeracy/mathematics concerns those who did not solve the simple calculation in the case of younger children, and the low performers in the mathematics test in the case of older children.
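The two rules for numeracy can be written as a short sketch. The function names and example scores are hypothetical, and the average is taken over whatever sample is passed in; whether it should be computed by country, round or age group is not specified here.

```python
from statistics import mean

def poor_numeracy_young(solved_calculation):
    """Ages 7/8: 1 if the child did not solve the simple calculation, else 0."""
    return [int(not solved) for solved in solved_calculation]

def poor_numeracy_older(test_scores):
    """Ages 11/12 and 14/15: 1 if the test score is below the sample average."""
    average = mean(test_scores)
    return [int(score < average) for score in test_scores]

# Hypothetical examples.
print(poor_numeracy_young([True, False, True]))
print(poor_numeracy_older([2, 4, 6, 8]))
```

The same below-average rule could be reused for other test scores that define low performers relative to the sample mean.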

General cognitive skills.

General cognitive skills are assessed through the Peabody Picture Vocabulary Test (PPVT), a test of receptive vocabulary commonly used to obtain a proxy measure for verbal ability or intelligence, or scholastic aptitude (Rolleston and James, 2011). In this test, a series of pictures is presented to children. Examiners speak a word describing one of the pictures and ask children to point to or say the number of the picture that the word describes. As for numeracy/mathematics, low performers in general cognitive skills are defined as those having a PPVT score below the average. The PPVT is conducted only in the last two survey rounds of the Young Lives study; therefore, this analysis can only be performed for children aged 11/12 and 14/15.

In total, seven models are estimated to analyse the determinants of, respectively, illiteracy at ages 7/8 and 11/12; poor numeracy skills at ages 7/8, 11/12 and 14/15; and poor general cognitive skills at ages 11/12 and 14/15. Table 4.3 presents some of the factors affecting learning achievements (see, for instance, Rolleston and James, 2011). To allow assessment of the impact of past conditions on poor learning achievements, variables for malnutrition, education, low self-esteem and household wealth from previous survey rounds can also be introduced into the models. Malnutrition can be proxied by a dummy variable for underweight (weight-for-age less than -2 standard deviations of the WHO Child Growth Standards median) and low self-esteem by a dummy variable for children who are ashamed of their achievements at school or of their clothes (Rolleston and James, 2011).

In Young Lives, household wealth is a composite indicator based on three sub-indexes relating to housing quality, access to services and consumer durables. The housing quality component includes information on crowding and on the main material of walls, roof and floor. The sub-index on access to services takes into account electricity, safe drinking water, sanitation and adequate fuels for cooking. The consumer durables component contains indicators on whether the household owns a certain number of goods, such as a radio, television, bicycle, motorbike, automobile, landline phone, mobile phone or refrigerator.
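A composite indicator of this kind can be sketched as follows. The equal weighting of the three sub-indexes, the coding of every item on a 0-1 scale and the item names are assumptions made for illustration only; they should be checked against the survey documentation before use.

```python
def sub_index(items: dict) -> float:
    """Average of items coded on a 0-1 scale (1 = better-off)."""
    return sum(items.values()) / len(items)

def wealth_index(housing: dict, services: dict, durables: dict) -> float:
    """Assumed equal-weight average of the three sub-indexes."""
    return (sub_index(housing) + sub_index(services) + sub_index(durables)) / 3

# Hypothetical household.
housing = {"walls": 1, "roof": 1, "floor": 0, "not_crowded": 1}
services = {"electricity": 1, "safe_water": 1, "sanitation": 0, "cooking_fuel": 0}
durables = {"radio": 1, "television": 0, "bicycle": 1, "mobile_phone": 0}

index = wealth_index(housing, services, durables)
print(round(index, 3))
```

The resulting index lies between 0 and 1 and can enter the model directly or be converted into quantile dummies.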

Using a similar model including the factors reported above, Rolleston and James find that basic cognitive skills development in early childhood is closely related to household wealth and caregiver literacy at the family level and school attendance at the individual level (2011). In middle childhood, prior cognitive skills performance and schooling appear to play a central role, whereas household factors do not. In adolescence, household wealth and caregiver literacy again become important predictors, in addition to prior abilities and education.

Malnutrition in childhood is also consistently associated with reduced cognitive development (Hardgrove et al., 2014). On the other hand, greater parental interest in children’s education tends to increase their cognitive development, which, in turn, reduces the intergenerational transmission of poverty (Blanden, 2006). Other factors are very likely to impact children’s learning achievements, such as health issues and school quality (Aturupane, Glewwe and Wisniewski, 2010), and could be added to the models.

Model 2: Determinants of school drop-out

The analysis of school drop-out must be restricted to youth belonging to the official school-age population. School drop-out refers to youth who are currently out of school but were previously enrolled and left education before completion. If the information available is limited to school attendance at the time of the survey, school drop-out can be proxied by out-of-school youth.

The main risk factors leading to school drop-out are displayed in Table 4.3. Early parenthood, child labour and poverty are likely to be endogenous, i.e. correlated with unobserved factors that also affect the probability of dropping out of school. This can potentially bias the results of the model. A way to deal with this problem is to follow the instrumental variable approach. The literature proposes several instruments for these potential endogenous variables. In Cardoso and Verner, for example, early parenthood is instrumented with the declared ideal age to start sexual activity, child labour with the declared reservation wage, and poverty with self-reported hunger (2006).
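The logic of the instrumental variable approach can be illustrated on simulated data. The sketch below is a deliberately simplified linear example with one endogenous regressor and one instrument; all variables and parameter values are hypothetical. With a binary outcome such as school drop-out, an instrumental variable estimator suited to binary outcomes or a linear probability model would be used instead.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

random.seed(0)
n = 5000
TRUE_EFFECT = 0.5  # hypothetical causal effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]   # instrument: affects y only through x
u = [random.gauss(0, 1) for _ in range(n)]   # unobserved factor: affects both x and y
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]               # endogenous regressor
y = [TRUE_EFFECT * xi + 2 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

ols_estimate = cov(x, y) / cov(x, x)   # biased: x is correlated with u
iv_estimate = cov(z, y) / cov(z, x)    # instrumental variable (Wald) estimator

print(f"OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}  true: {TRUE_EFFECT}")
```

The naive estimate overstates the effect because the unobserved factor raises both the regressor and the outcome, while the instrumented estimate stays close to the true value as long as the instrument is relevant and unrelated to the unobserved factor.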

It is important to account for child labour when the analysis is performed for developing countries, where child labour is frequent and constitutes one of the main obstacles to education, in particular to children’s persistence in the school system (Guarcello, Lyon and Rosati, 2008). Dropping out of school is also positively associated with early parenthood and poverty (Cardoso and Verner, 2006), as well as with educational factors, such as grade repetition and completion, poor learning achievements and low school quality (Branson, Hofmeyr and Lam, 2013; Hanushek, Lavy and Hitomi, 2006). Poor parental education and occupation, and the presence of siblings, in particular younger ones, are some of the other factors increasing the likelihood of school drop-out (Huisman and Smits, 2014).

Determinants of youth well-being deficits in health

In the health dimension, two models are proposed to analyse the determinants of adolescent pregnancy and substance abuse, respectively (Table 4.4).

Table 4.4. Determinants of youth disadvantage: health

| Well-being deficits | Adolescent pregnancy | Substance abuse |
| --- | --- | --- |
| Type of data needed | Household surveys, including DHS and CDC-assisted reproductive health surveys, or population censuses | Information on substance use is available mainly in specialised surveys, such as the GSHS and the HBSC survey |
| Estimation model | Logit or probit model | Logit or probit model |
| Dependent variable | Dummy variable for adolescent pregnancy or adolescent mothers (adolescent girls being defined as aged 15-19) | Dummy variable for adolescents who are current consumers of substances. Two different types of substances considered: alcohol and illicit drugs |
| Independent variables: individual characteristics | ethnicity/race, religion, substance abuse (tobacco, alcohol, illicit drugs), mental health (depression), early sexual initiation and unprotected sex (condom and contraceptive use) | age, gender, ethnicity/race, school drop-out, mental health, risky behaviours (criminal/violent behaviours) |
| Independent variables: family characteristics | income poverty, low parental education and socioeconomic status, lack of parental support/connectedness, single parenthood, domestic abuse and violence, family structure (presence of older sexually active siblings or pregnant/parenting teenaged sisters) | domestic abuse/violence, parental substance abuse, lack of parental support/connectedness, lack of parental control (supervision, monitoring) |
| Independent variables: community characteristics | unsafe/poor neighbourhoods, negative peer influences (high risk peer associations), social norms (parental and adolescent sexual attitudes and values) | unsafe/poor neighbourhoods; negative peer influences (crime/violence); social norms (social acceptance of drug use) |

Model 1: Determinants of adolescent pregnancy

Factors included in Table 4.4 are often found to be positively correlated with adolescent pregnancy or motherhood and could therefore be included in the model. According to the literature, adolescent pregnancy is positively associated with, among other factors, lack of parental support and connectedness, living with a single parent, residing in unsafe neighbourhoods, having older sexually active siblings or other pregnant or parenting teenaged sisters, and sexual abuse (Miller, Benson and Galbraith, 2001). Besides lack of parental support/connectedness, lack of parental regulation (supervision and monitoring) and poor parent/teen communication about sexual issues also increase the probability of adolescent pregnancy.

Model 2: Determinants of substance use

In the GSHS, current alcohol consumers are defined as those who consumed one or more alcoholic drinks in the past 30 days and current drug consumers as those who used illicit drugs at least once in the past 30 days. Illicit drugs include marijuana, amphetamines or methamphetamines, cocaine, solvents or inhalants, ecstasy, heroin and other country-specific drugs. The GSHS covers students aged 13-17.

Table 4.4 lists the main risk factors associated with substance abuse. Parental substance abuse in particular can lead to impaired parental control and child maltreatment and, subsequently, the consumption of substances by the child (Singh, Thornton and Tonmyr, 2011). During adolescence, substance abuse increases the risk of injury, violence, poor cognitive skills development, unprotected sex and suicide attempts. In adulthood, it raises the risk of non-communicable diseases and injuries (WHO, 2014).

Determinants of youth well-being deficits in civic participation and empowerment

As regards civic participation and empowerment, the analysis of determinants will focus first on the lack of civic engagement among youth and then on youth violence (Table 4.5).

Table 4.5. Determinants of youth disadvantage: civic participation and empowerment

| Well-being deficits | Civic engagement | Youth violence |
| --- | --- | --- |
| Type of data needed | Socioeconomic surveys, such as the Gallup World Poll surveys, contain several questions on civic engagement and thus can be used to perform this analysis. | Youth violence can be analysed using international specialised surveys, such as the GSHS, covering adolescents aged 13-17, and the HBSC survey, covering adolescents ages 11, 13 and 15 |
| Estimation model | Logit or probit model | Logit or probit model |
| Dependent variable | Dummy variable taking value 1 for youth not civically engaged and 0 for youth exhibiting some form of civic engagement | Dummy variable taking value 1 for violent youth and 0 for non-violent youth |
| Independent variables: individual characteristics | age, gender, ethnicity/race, migration, education | age, gender, race/ethnicity, migration, mental disorders, low cognitive and non-cognitive skills (low self-esteem, emotional distress), education (poor academic performance, school drop-out), risky behaviours (substance abuse) |
| Independent variables: family characteristics | income poverty, poor parental education, lack of parental support/connectedness, lack of parental control (supervision, monitoring), parents’ civic and political engagement, single parenthood | income poverty, poor parental education, lack of parental support/connectedness, lack of parental control (supervision, monitoring), domestic abuse and violence, parental behaviours and attitudes (crime/violence, substance use), family structure (single parenthood) |
| Independent variables: community characteristics | location, urban/rural area of residence, unsafe/poor neighbourhoods, low access/quality of schools; low school connectedness, lack of infrastructure and public services, low social capital and social norms (interpersonal trust, social networks/connectedness, community values and attitudes – civic, social, religious) | location, urban/rural area of residence, unsafe/poor neighbourhoods, lack of infrastructure and public services, access to/quality of schools, low school connectedness, negative peer influences (peer delinquency), social capital (interpersonal trust, social networks/connectedness) |

Model 1: Determinants of the lack of civic engagement

Civic engagement can be defined using three questions from the Gallup World Poll surveys. Individuals are asked whether, in the past month, they have i) volunteered their time to an organisation; ii) donated money to a charity; or iii) helped a stranger or someone they did not know who needed help. In this model, youth who answered no to all three are considered not civically engaged; those who answered yes to at least one question are considered civically engaged (to some extent).
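The rule described above translates directly into the dependent variable of the model. In the sketch below, the argument names are hypothetical and stand for the answers to the three survey questions.

```python
def not_civically_engaged(volunteered: bool, donated: bool, helped_stranger: bool) -> int:
    """Return 1 if the respondent answered no to all three questions, else 0."""
    return int(not (volunteered or donated or helped_stranger))

# Hypothetical respondents: (volunteered, donated, helped a stranger).
respondents = [(False, False, False), (False, True, False), (True, True, True)]
print([not_civically_engaged(*answers) for answers in respondents])
```

Coding the absence of engagement as 1 keeps the model consistent with the others in this section, where the dependent variable takes value 1 for youth experiencing the well-being deficit.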

Table 4.5 presents some factors that could be integrated into the model, since they are very likely to affect youth civic engagement. As regards personal attributes, youth civic engagement is deeply rooted in the process of identity formation during adolescence. The psychological literature shows that adolescents who are self-reflective and who actively explore identity alternatives before making decisions about the values, beliefs and goals that they will pursue have more non-cognitive skills and positive personality traits, develop a higher sense of social responsibility and are more engaged in communities (Crocetti, Erentaitė and Žukauskienė, 2014; Crocetti, Jahromi and Meeus, 2012). This is in contrast to adolescents who more automatically conform to others’ prescriptions and expectations (parents or other reference groups) or who do not deal with identity issues.

At the family level, affluence is shown to be positively related to youth participation in community activities (Lenzi et al., 2012). Empirical studies focussing on the role of parents demonstrate, in addition, that parental civic values and behaviours have an influence on youth civic engagement (Atkins and Hart, 2003). Moreover, lack of parental support and monitoring in childhood is shown to have an adverse impact on future youth civic engagement (ibid.).

Community characteristics also play a significant role in youth civic engagement. For instance, youth are less likely to be civically engaged when living in poor urban neighbourhoods (Atkins and Hart, 2003) or in communities with low levels of social capital (social ties and trust) and school connectedness (Lenzi et al., 2013, 2012).

Model 2: Determinants of youth violence

The last model proposed examines the factors correlated with youth involvement in violence. Using the GSHS, youth can be defined as violent if they reported any of the following behaviours: i) belonged to a violent group (attacks with insults, bullying, hits, assault, robbery, or rape); ii) were repeatedly in a physical fight in the past 12 months; or iii) have carried a weapon (gun, knife, club or other) on several occasions in the past 30 days.

Characteristics displayed in Table 4.5 have been identified by the literature as contributing factors to violent youth behaviours (O’Brien et al., 2013; Burfeind and Bartusch, 2016), and hence could be taken into account. Risk for violence among youth tends to increase in particular with mental disorders (e.g. attention-deficit hyperactivity disorder symptoms), poor non-cognitive skills (e.g. emotional distress), low school connectedness, poor academic performance and low educational aspirations, and high peer delinquency (Bernat et al., 2012). Substance use is another significant risk factor leading to juvenile delinquency (Simoes, Matos and Batista-Foguet, 2008).

Other proximal determinants of youth violence can be found at the family level. For instance, youth who live in broken homes and lack parental support and control are more susceptible to delinquent peer influences, which increases their opportunities for involvement in delinquency (Burfeind and Bartusch, 2016; Alboukordi et al., 2012).

Table 4.6. Examples of risk factors to youth well-being at the individual, family and community levels

Individual level

| Risk factor | Suggested indicator | Data sources |
| --- | --- | --- |
| Age group | Dummy variable for adolescents (aged 10-19) or young adults (aged 20-29), or a continuous variable for age (with a quadratic term) | Household surveys, labour force surveys, etc. |
| Gender | Dummy variable for being female | Household surveys, labour force surveys, etc. |
| Minority status | Dummy variable for minority status | Household surveys, labour force surveys, etc. (unavailable in some countries) |
| Disability | Dummy variable for youth aged 15-24 (or 15-29) who have a lot of difficulty with, or cannot do at all, at least one of the following activities: i) seeing (even if wearing glasses); ii) hearing (even if using a hearing aid); iii) walking/climbing steps; iv) remembering/concentrating; v) self-care; vi) communicating | SWTS, other household surveys with information on disability status |
| Mental health / low self-esteem | Dummy variable for youth aged 13-17 who have seriously considered attempting suicide in the past 12 months | |
| Poor nutrition | Dummy variable for youth aged 13-17 considered underweight, i.e. with a body mass index (BMI) more than two standard deviations below the median for the corresponding age and gender. The BMI is calculated by dividing weight in kg by the square of height in m. | |
| Early sexual activity | Age at first sexual intercourse | |
| Risky sexual behaviour: condom use at the last high-risk sexual intercourse | Dummy variable for young men and women aged 15-24 reporting the use of a condom during sexual intercourse with a non-cohabiting, non-marital sexual partner in the last 12 months | United Nations Millennium Development Goals Indicators (Goal 6, Target 6.A, Indicator 6.2); household surveys (Multiple Indicator Cluster Surveys, DHS, etc.); reproductive and health surveys; STEPwise approach to surveillance surveys |
| Risky sexual behaviour: contraception prevalence among adolescents | Dummy variable for adolescent single or married girls aged 15-19 who are currently using, or whose sexual partner is using, at least one method of contraception, regardless of the method | WHO – Adolescent birth rate: Data by country, household surveys, DHS |
| Adolescent pregnancy | Dummy variable for adolescent girls under age 18 who have given birth or who are currently pregnant | |
| Tobacco use | Dummy variable for youth aged 13-15 who are current users of any tobacco product (those who consumed any smokeless or smoking tobacco product at least once in the past 30 days) | WHO – Global youth tobacco survey data |
| Alcohol use | Dummy variable for youth aged 13-17 who had at least one drink containing alcohol in the past 30 days | |
| Illicit drug use | Dummy variable for youth aged 13-17 who have used illicit drugs (marijuana, amphetamines or methamphetamines, cocaine, solvents or inhalants, ecstasy, heroin) at least once in the past 30 days | |
| School absenteeism | Dummy variable for youth aged 13-17 who missed at least one day of classes or school without permission in the past 30 days | |
| Grade repetition | Dummy variable for students enrolled in the same grade as the previous year in i) primary education (all grades); ii) secondary general education (all grades) | UNESCO Institute for Statistics database; school register, school census or surveys for data on repeaters and enrolment by grade |
| School drop-out | Dummy variable for students who dropped out of i) primary education; ii) lower secondary general education | UNESCO Institute for Statistics database; national population census; household surveys, labour force surveys |
| Poor academic performance | Continuous variables with performance scores in reading, mathematics and science among students aged 15, or dummy variables for low performers | OECD – PISA surveys; other international surveys: TIMSS, PASEC, SACMEQ, LLECE |
| Not in employment, education or training (NEET) | Dummy variable for youth aged 15-24 (or 15-29) neither in employment nor in education or training | SWTS, labour force surveys, population census and/or other household surveys with an appropriate employment module |
| Child labour | Dummy variable for i) children aged 10-11 in any type of economic activity; ii) children aged 12-14 in economic activity, excluding light work; iii) children aged 15-17 in hazardous or worst forms of work | ILO child labour surveys, UNICEF Multiple Indicator Cluster Surveys, labour force surveys, other household surveys with a child labour module |
| Violent/criminal behaviours | i) Dummy variable for youth aged 13-17 who were in a physical fight (e.g. one or more person(s) hit or struck someone, or one or more person(s) hurt someone with a weapon, such as a stick, knife or gun) at least once in the past 12 months; ii) dummy variable for youth aged 13-17 who carried a weapon (e.g. gun, knife, club) at least once in the past 30 days | |
| Belonging to a violent/criminal group | Dummy variable for youth aged 13-17 who belong to any violent group (group of people who attack other people or groups with insults, bullying, hits, assault, robbery or rape) | |


Family/Household level

| Risk factor | Suggested indicator | Data sources |
| --- | --- | --- |
| Lack of parental support | Dummy variable for youth aged 13-17 who reported that their parents never or rarely supported and encouraged them in the past 30 days | |
| Domestic abuse/violence | Dummy variable for young women aged 15-24 (or 15-29) who have experienced physical and/or sexual violence by an intimate partner or family member ever or in the past 12 months | |


Community level

| Risk factor | Suggested indicator | Data sources |
| --- | --- | --- |
| Violent neighbourhoods | Homicide and assault rates at sub-national levels (available in some countries) | United Nations Office on Drugs and Crime (UNODC) – Homicide statistics; UNODC – Statistics on crime (only available for the total population); United Nations Survey of Crime Trends and Operations of Criminal Justice Systems |
| Unsafe neighbourhoods | Percentage of people who feel unsafe walking alone at night in their city, area or community | Gallup World Poll surveys |
| Negative peer influences | Interpersonal trust: i) dummy variable for youth (up to age 29) who believe that they need to be very careful in dealing with people; ii) dummy variable for youth (up to age 29) who believe that most people would try to take advantage of them if they got the chance (scored less than 5 on a 1 to 10 scale) | World Values Survey – Online Data Analysis (WVS wave 6) |
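
Several of the indicators suggested in Table 4.6 are simple transformations of survey variables. The following minimal sketch, in Python, illustrates how a few of them could be coded before running descriptive or regression analyses: the adolescent dummy and quadratic age term, the underweight dummy based on BMI, and the NEET dummy. It is an illustration under stated assumptions rather than a prescribed procedure. The field names (`age`, `weight_kg`, `height_m`, `employed`, `in_education`, `in_training`), the two sample records and the reference values `REF_MEDIAN` and `REF_SD` are all hypothetical. In practice the reference median and standard deviation differ by age and gender and should be taken from an appropriate growth reference, and the simple z-score used here is a simplification.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by the square of height in m."""
    return weight_kg / height_m ** 2


def underweight_dummy(weight_kg, height_m, ref_median, ref_sd):
    """Return 1 if BMI lies more than two standard deviations below the median."""
    z_score = (bmi(weight_kg, height_m) - ref_median) / ref_sd
    return 1 if z_score < -2 else 0


def adolescent_dummy(age):
    """Return 1 for adolescents (aged 10-19) and 0 for young adults (aged 20-29)."""
    if not 10 <= age <= 29:
        raise ValueError("age outside the 10-29 range used for youth")
    return 1 if age <= 19 else 0


def neet_dummy(employed, in_education, in_training):
    """Return 1 if the person is neither in employment nor in education or training."""
    return 0 if (employed or in_education or in_training) else 1


def build_indicators(person, ref_median, ref_sd):
    """Code one survey record into the indicators used in the analysis.

    Indicators are set to None when the person falls outside the age range
    suggested for that indicator (13-17 for underweight, 15-29 for NEET).
    """
    age = person["age"]
    return {
        "adolescent": adolescent_dummy(age),
        "age": age,
        "age_sq": age ** 2,  # quadratic term for a continuous age variable
        "underweight": (
            underweight_dummy(person["weight_kg"], person["height_m"], ref_median, ref_sd)
            if 13 <= age <= 17 else None
        ),
        "neet": (
            neet_dummy(person["employed"], person["in_education"], person["in_training"])
            if 15 <= age <= 29 else None
        ),
    }


# Placeholder reference values for illustration only (NOT actual growth-reference values).
REF_MEDIAN, REF_SD = 19.0, 2.0

# Two hypothetical survey records.
youth = [
    {"age": 16, "weight_kg": 40.0, "height_m": 1.65,
     "employed": False, "in_education": True, "in_training": False},
    {"age": 22, "weight_kg": 60.0, "height_m": 1.70,
     "employed": False, "in_education": False, "in_training": False},
]

indicators = [build_indicators(p, REF_MEDIAN, REF_SD) for p in youth]
for row in indicators:
    print(row)
```

Coding out-of-range cases as missing rather than zero keeps each dummy restricted to the age group for which the indicator is defined, so that youth outside that group are not counted as "not at risk" in later tabulations or regressions.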


Alboukordi, S. et al. (2012), “Predictive Factors for Juvenile Delinquency: The Role of Family Structure, Parental Monitoring and Delinquent Peers”, International Journal of Criminology and Sociological Theory, Vol. 5/1, Lifescience Global, Dubai, pp. 770-778.

Atkins, R. and D. Hart (2003), “Neighborhoods, Adults, and the Development of Civic Identity in Urban Youth”, Applied Developmental Science, Vol. 7/3, Taylor & Francis, Abingdon, pp. 156-164.

Aturupane, H., P. Glewwe and S. Wisniewski (2013), “The Impact of School Quality, Socioeconomic Factors, and Child Health on Students’ Academic Performance: Evidence from Sri Lankan Primary Schools”, Education Economics, Vol. 21/1, Taylor & Francis, Abingdon, pp. 2-37.

Bernat, D.H. et al. (2012), “Risk and Direct Protective Factors for Youth Violence: Results from the National Longitudinal Study of Adolescent Health”, American Journal of Preventive Medicine, Vol. 43/2, Elsevier Inc. on behalf of the American College of Preventive Medicine and the Association for Prevention Teaching and Research, Washington, DC, pp. S57-S66.

Bills, D.B. et al. (2015), “Gender, Regional Socioeconomic Development, and the School to Work Transition of Young Brazilians”, conference paper, International Labour Organization Work4Youth Global Research Symposium, Geneva, 3-4 March 2015.

Blanden, J. (2006), “‘Bucking the Trend’: What Enables Those Who Are Disadvantaged in Childhood to Succeed Later in Life?”, a report of research carried out by the Department of Economics, University of Surrey and the Centre for Economic Performance, London School of Economics on behalf of the Department for Work and Pensions, Working Paper, No. 31, published for the Department for Work and Pensions by Corporate Document Services, London.

Branson, N., C. Hofmeyr and D. Lam (2014), “Progress Through School and the Determinants of School Dropout in South Africa”, Development Southern Africa, Vol. 31/1, Taylor & Francis, Abingdon, pp. 106-126.

Burfeind, J.W. and D.J. Bartusch (2016), Juvenile Delinquency: An Integrated Approach, Routledge, Abingdon.

Cardoso, A.R. and D. Verner (2006), “School Drop-out and Push-out Factors in Brazil: The Role of Early Parenthood, Child Labor, and Poverty”, World Bank Policy Research Working Paper, No. 4178, World Bank, Washington, DC.

Crocetti, E., R. Erentaitė and R. Žukauskienė (2014), “Identity Styles, Positive Youth Development, and Civic Engagement in Adolescence”, Journal of Youth and Adolescence, Vol. 43/11, Springer Publishing, New York, pp. 1818-1828.

Crocetti, E., P. Jahromi and W. Meeus (2012), “Identity and Civic Engagement in Adolescence”, Journal of Adolescence, Vol. 35/3, Elsevier Ltd., Amsterdam, pp. 521-532.

Elder, S. and K.S. Koné (2014), Labour Market Transitions of Young Women and Men in Sub-Saharan Africa, Work4Youth Publication Series, No. 9, International Labour Organization, Geneva.

Eurofound (2012), NEETs – Young People not in Employment, Education or Training: Characteristics, Costs and Policy Responses in Europe, European Foundation for the Improvement of Living and Working Conditions, Publications Office of the European Union, Brussels.

Guarcello, L., S. Lyon and F.C. Rosati (2008), “Child Labour and Education for All: An Issue Paper”, Understanding Children’s Work Project Working Paper Series, Rome.

Hanushek, E.A., V. Lavy and K. Hitomi (2006), “Do Students Care about School Quality? Determinants of Dropout Behavior in Developing Countries”, National Bureau of Economic Research Working Paper, No. w12737, Journal of Human Capital, Vol. 2/1, University of Chicago Press, pp. 69-105.

Hardgrove, A. et al. (2014), “Youth Vulnerabilities in Life Course Transitions”, United Nations Development Programme (UNDP) Occasional Paper, UNDP, New York.

Huisman, J. and J. Smits (2014), “Keeping Children in School: Effects of Household and Context Characteristics on School Dropout in 363 Districts of 30 Developing Countries”, Nijmegen Center for Economics (NiCE) Working Paper, No. 09-105, Radboud University Nijmegen, The Netherlands.

J-PAL (2012), “Deworming: A Best Buy for Development”,

ILO (2015), “What Does NEETs Mean and Why Is the Concept so Easily Misinterpreted?”, Technical Brief Note, No.1, International Labour Organization, Geneva.

ILO (2013a), Global Employment Trends for Youth 2013: A Generation at Risk, International Labour Organization, Geneva.

ILO (2013b), Measuring Informality: A Statistical Manual on the Informal Sector and Informal Employment, International Labour Organization, Geneva.

Lenzi, M. et al. (2013), “Neighborhood Social Connectedness and Adolescent Civic Engagement: An Integrative Model”, Journal of Environmental Psychology, Vol. 34, Elsevier Ltd., Amsterdam, pp. 45-54.

Lenzi, M. et al. (2012), “Family Affluence, School and Neighborhood Contexts and Adolescents’ Civic Engagement: A Cross-National Study”, American Journal of Community Psychology, Vol. 50/1-2, Springer, New York, pp. 197-210.

Miller, B.C., B. Benson and K.A. Galbraith (2001), “Family Relationships and Adolescent Pregnancy Risk: A Research Synthesis”, Developmental Review, Vol. 21/1, Elsevier B.V., Amsterdam, pp. 1-38.

O’Brien, K. et al. (2013), “Youth Gang Affiliation, Violence, and Criminal Activities: A Review of Motivational, Risk, and Protective Factors”, Aggression and Violent Behavior, Vol. 18/4, Elsevier Ltd., Amsterdam, pp. 417-425.

OECD (2013), “Social Policies for Youth: Bridging the Gap to Independence”, Scoping Paper, Paris.

Rolleston, C. and Z. James (2011), “The Role of Schooling in Skill Development: Evidence from Young Lives in Ethiopia, India, Peru and Vietnam”, Background Paper commissioned for the Education for All Global Monitoring Report 2012, United Nations Educational, Scientific and Cultural Organization, Paris.

Rosati, F.C., M. Ranzani and M. Manacorda (2015), “Pathways to Work in the Developing World: An Analysis of Young Persons’ Transition from School to the Workplace”, Understanding Children’s Work Programme, Working Paper, Rome.

Shehu, E. and B. Nilsson (2014), Informal Employment Among Youth: Evidence from 20 School-to-Work Transition Surveys, Work4Youth Publication Series, No. 8, International Labour Organization, Geneva.

Simoes, C., M.G. Matos and J.M. Batista-Foguet (2008), “Juvenile Delinquency: Analysis of Risk and Protective Factors using Quantitative and Qualitative Methods”, Cognition, Brain, Behavior (CBB). An Interdisciplinary Journal, Vol. 7/4, CBB, Editura ASCR Romania, Cluj-Napoca, pp. 389-408.

Singh, V.A.S., T. Thornton and L. Tonmyr (2011), “Determinants of Substance Abuse in a Population of Children and Adolescents Involved with the Child Welfare System”, International Journal of Mental Health and Addiction, Vol. 9/4, Springer Publishing, New York, pp. 382-397.

UNESCO (2012), Education for All Global Monitoring Report 2012 – Youth and Skills: Putting Education to Work, United Nations Educational, Scientific and Cultural Organization, Paris.

WHO (2014), Health for the World’s Adolescents: A Second Chance in the Second Decade, World Health Organization, Geneva.

Wooldridge, J.M. (2002), Econometric Analysis of Cross Section and Panel Data, Massachusetts Institute of Technology Press, Cambridge, MA.


1. This model is based on Rolleston and James (2011).

2. The 2013 survey round is not yet publicly available.