# Annex B. Technical notes on analyses in this volume

**A note regarding Israel**

The statistical data for Israel are supplied by and under the responsibility of the relevant Israeli authorities. The use of such data by the OECD is without prejudice to the status of the Golan Heights, East Jerusalem and Israeli settlements in the West Bank under the terms of international law.

## Use of teacher and school weights

The statistics presented in this report were derived from data obtained through samples of schools, school principals and teachers. The sample was collected following a stratified two-stage probability sampling design. This means that teachers (second-stage units or secondary sampling units) were randomly selected from the list of in-scope teachers for each of the randomly selected schools (first-stage or primary sampling units). For these statistics to be meaningful for a country, they needed to reflect the whole population from which they were drawn and not merely the sample used to collect them. Thus, survey weights must be used in order to obtain design-unbiased estimates of population or model parameters.

Final weights allow the production of country-level estimates from the observed sample data. The estimation weight indicates how many population units are represented by a sampled unit. The final weight is the combination of many factors reflecting the probabilities of selection at the various stages of sampling and the response obtained at each stage. Other factors may also come into play as dictated by special conditions to maintain the unbiasedness of the estimates (e.g. adjustment for teachers working in more than one school).

Statistics presented in this report that are based on the responses of school principals and that contribute to estimates related to school principals were estimated using school weights (SCHWGT). Results based only on responses of teachers or on responses of teachers and principals (i.e. responses from school principals were merged with teachers’ responses) were weighted by teacher weights (TCHWGT).
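As a minimal sketch of how an estimation weight enters a country-level estimate, the snippet below computes a weighted share from teacher responses. The function name and the toy records are illustrative assumptions; only the weight name TCHWGT comes from the text above.

```python
def weighted_mean(values, weights):
    """Weighted mean: each sampled unit counts as `weight` population units."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical teacher records: response coded 1 ("yes") or 0 ("no"),
# paired with a made-up final teacher weight (TCHWGT).
responses = [1, 0, 1, 1]
tchwgt = [10.0, 30.0, 5.0, 5.0]

share_yes = weighted_mean(responses, tchwgt)   # design-weighted share
unweighted = sum(responses) / len(responses)   # ignores the sampling design
```

In this toy example the weighted and unweighted shares differ markedly, because the single "no" respondent represents many more teachers than the others.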

## Use of complex variables and scales

### Scales

In this report, several scale indices are used in regression analyses. Descriptions of the construction and validation of these scales can be found in Chapter 11 of the *TALIS 2018 Technical Report* (OECD, 2019[1]).

### Ratios and other variables derived from TALIS data

**Student-teacher ratio** – The student-teacher ratio was derived from school principals’ responses to a question about the number of staff (head counts) currently working in the school and the total number of students (head counts) of all grades in the school. Therefore, the measure is not restricted to those teaching or supporting ISCED level 2 education in the school but covers education at all levels provided in the school. The ratio is derived by dividing the number of students by the number of teachers (those whose main activity is the provision of instruction to students). The analyses reporting this ratio in Chapter 3 were done at the school level and, therefore, used the final school estimation weight (SCHWGT).

**Ratio of teachers to number of personnel for pedagogical support** – This ratio was derived from school principals’ responses to a question about the number of staff (head counts) currently working in the whole school and is, therefore, not restricted only to those teaching or supporting ISCED level 2 education in the school. The ratio is derived by dividing the number of teachers (those whose main activity is the provision of instruction to students) by the number of personnel for pedagogical support (including all teacher aides or other non-teaching professionals who provide instruction or support teachers). The analyses reporting this ratio in Chapter 3 were done at the school level and, therefore, used the final school estimation weight (SCHWGT). In line with the approach taken in TALIS 2013, for those few observations where the number of personnel for pedagogical support is zero, the ratio of teachers to number of personnel for pedagogical support is set to equal the number of teachers.

**Ratio of teachers to number of school administrative or management personnel** – This ratio was derived from school principals’ responses to a question about the number of staff (head counts) currently working in the school. Therefore, the measure is not restricted to those teaching or supporting ISCED level 2 education in the school but covers education at all levels provided in the school. The ratio is derived by dividing the number of teachers (those whose main activity is the provision of instruction to students) by the sum of school administrative personnel and management personnel. School administrative personnel include receptionists, secretaries and administration assistants, while management personnel include principals, assistant principals, and other management staff whose main activity is management. The analyses reporting this ratio were done at the school level and, therefore, used the final school estimation weight (SCHWGT).
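The three ratios described above can be sketched as follows. The function and argument names are hypothetical, and returning `None` for missing or zero denominators is an assumption wherever the text does not specify a rule; only the zero-support rule (ratio set to the number of teachers) is taken from the text.

```python
def student_teacher_ratio(n_students, n_teachers):
    """Students (head count) divided by teachers (head count)."""
    if n_students is None or not n_teachers:
        return None  # assumption: undefined when teachers are missing or zero
    return n_students / n_teachers


def teacher_support_ratio(n_teachers, n_support):
    """Teachers divided by personnel for pedagogical support."""
    if n_teachers is None or n_support is None:
        return None  # assumption: missing inputs give a missing ratio
    if n_support == 0:
        # Rule stated in the text: ratio is set to the number of teachers.
        return float(n_teachers)
    return n_teachers / n_support


def teacher_admin_ratio(n_teachers, n_admin, n_management):
    """Teachers divided by administrative plus management personnel."""
    if None in (n_teachers, n_admin, n_management):
        return None  # assumption: missing inputs give a missing ratio
    denominator = n_admin + n_management
    if denominator == 0:
        return None  # assumption: the text gives no rule for this case
    return n_teachers / denominator
```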

**Practices pertaining to clarity of instruction** – This variable was derived from teachers’ responses to a question about the frequency of use of certain teaching practices in the target class. The variable is constructed as a binary variable based on the arithmetic mean of three teaching practices: “I present a summary of recently learned content”, “I refer to a problem from everyday life or work to demonstrate why new knowledge is useful” and “I let students practise similar tasks until I know that every student has understood the subject matter”. The variable takes the value of 0 if the arithmetic mean of the aforementioned three teaching practices is lower than 0.5, while it is equal to 1 if the arithmetic mean is greater than 0.5. In those few cases where the arithmetic mean is equal to 0.5, the variable is set to missing.
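The cut-off rule above can be sketched as a small function. The coding of the three items on a 0-1 scale and the choice to average over the answered items only are assumptions made for illustration; the text specifies only the comparison of the arithmetic mean with 0.5.

```python
def clarity_of_instruction(item_scores):
    """Return 1, 0 or None from the mean of the three practice items.

    `item_scores` holds the three items, each assumed to be coded on a
    0-1 scale already; None marks a missing response.
    """
    answered = [score for score in item_scores if score is not None]
    if not answered:
        return None  # nothing to average
    mean = sum(answered) / len(answered)
    if mean > 0.5:
        return 1
    if mean < 0.5:
        return 0
    return None  # mean exactly 0.5: set to missing, as in the text
```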

## International averages

The OECD and TALIS averages, which were calculated for most indicators presented in this report, correspond to the arithmetic mean of the respective country estimates. When the statistics are based on responses of teachers, the OECD and TALIS averages cover, respectively, 31 and 48 countries and economies (see Table AI.B.1). In those cases where the analysis is based on principals’ responses, the OECD and TALIS averages cover, respectively, 30 and 47 countries and economies.

The EU total represents the 23 European Union member states that also participated in TALIS 2018 as a single entity and to which each of the 23 EU member states contribute in proportion to the number of teachers or principals, depending on the basis of the analysis. Therefore, the EU total is calculated as a weighted arithmetic mean based on the sum of final teacher (TCHWGT) or principal (SCHWGT) weights by country, depending on the target population.
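The difference between the two kinds of aggregate can be illustrated with a short sketch. The country labels, estimates and weight sums below are invented for illustration, and the function names are hypothetical.

```python
def international_average(country_estimates):
    """Unweighted arithmetic mean of country estimates (OECD/TALIS average)."""
    values = list(country_estimates.values())
    return sum(values) / len(values)


def eu_total(country_estimates, country_weight_sums):
    """Mean of country estimates weighted by each country's sum of final weights."""
    total_weight = sum(country_weight_sums[c] for c in country_estimates)
    weighted_sum = sum(
        estimate * country_weight_sums[c]
        for c, estimate in country_estimates.items()
    )
    return weighted_sum / total_weight


# Hypothetical country estimates (e.g. a percentage of teachers) and
# hypothetical sums of final teacher weights (TCHWGT) per country.
estimates = {"A": 60.0, "B": 40.0, "C": 50.0}
weight_sums = {"A": 100_000.0, "B": 300_000.0, "C": 100_000.0}

average = international_average(estimates)   # each country counts equally
total = eu_total(estimates, weight_sums)     # large countries count more
```

In this toy example the weighted total is pulled towards country B, which represents the largest number of teachers.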

In this publication, the OECD average is generally used when the focus is on providing a global tendency for an indicator and comparing its values across education systems. In the case of some countries and economies, data may not be available for specific indicators, or specific categories may not apply. Therefore, readers should keep in mind that the term “OECD average” refers to the OECD countries and economies included in the respective comparisons. In cases where data are not available or do not apply to all sub-categories of a given population or indicator, the “OECD average” may be consistent within each column of a table but not necessarily across all columns of a table.

## Standard errors and significance tests

The statistics in this report represent estimates based on samples of teachers and principals, rather than values that could be calculated if every teacher and principal in every country had answered every question. Consequently, it is important to measure the degree of uncertainty of the estimates. In TALIS, each estimate has an associated degree of uncertainty that is expressed through a standard error. The use of confidence intervals provides a way to make inferences about the population means and proportions in a manner that reflects the uncertainty associated with the sample estimates. From an observed sample statistic and assuming a normal distribution, it can be inferred that the corresponding population result would lie within the confidence interval in 95 out of 100 replications of the measurement on different samples drawn from the same population. The reported standard errors were computed with a balanced repeated replication (BRR) methodology.
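A minimal sketch of a replication-based standard error and the resulting 95% confidence interval is given below. It assumes that an estimate has already been recomputed once per set of replicate weights; the `fay_factor` parameter and its default are assumptions for illustration, and the actual replication settings should be taken from the *TALIS 2018 Technical Report*.

```python
import math


def brr_standard_error(full_estimate, replicate_estimates, fay_factor=0.0):
    """Standard error from replicate estimates.

    fay_factor=0.0 gives plain BRR; a value in (0, 1) gives Fay's variant.
    The default is an illustrative assumption, not the TALIS setting.
    """
    if not 0.0 <= fay_factor < 1.0:
        raise ValueError("fay_factor must be in [0, 1)")
    if not replicate_estimates:
        raise ValueError("at least one replicate estimate is required")
    n_replicates = len(replicate_estimates)
    squared_deviations = sum(
        (replicate - full_estimate) ** 2 for replicate in replicate_estimates
    )
    variance = squared_deviations / (n_replicates * (1.0 - fay_factor) ** 2)
    return math.sqrt(variance)


def confidence_interval_95(estimate, standard_error):
    """Normal-approximation 95% confidence interval."""
    half_width = 1.96 * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical full-sample estimate and four replicate estimates.
se = brr_standard_error(10.0, [9.0, 11.0, 9.0, 11.0])
lower, upper = confidence_interval_95(10.0, se)
```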

### Differences between sub-groups

Differences between sub-groups along teacher (e.g. female teachers and male teachers) and school characteristics (e.g. schools with a high concentration of students from socio-economically disadvantaged homes and schools with a low concentration of students from socio-economically disadvantaged homes) were tested for statistical significance. All differences marked in bold in the data tables of this report are statistically significantly different from 0 at the 95% level.

In the case of differences between sub-groups, the standard error is calculated by taking into account that the two sub-samples are not independent. As a result, the expected value of the covariance might differ from 0, leading to smaller estimates of standard error as compared to estimates of standard error calculated for the difference between independent sub-samples.
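In symbols, and as a general statistical identity rather than a formula quoted from the report, the standard error of the difference between two sub-group estimates $\hat{\theta}_1$ and $\hat{\theta}_2$ can be written as:

$$\mathrm{SE}(\hat{\theta}_1 - \hat{\theta}_2) = \sqrt{\mathrm{SE}(\hat{\theta}_1)^2 + \mathrm{SE}(\hat{\theta}_2)^2 - 2\,\mathrm{Cov}(\hat{\theta}_1, \hat{\theta}_2)}$$

When the covariance is positive, the result is smaller than the value obtained under independence, where the covariance term drops out.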

### Differences between cycles

Differences between TALIS cycles (e.g. change between 2013 and 2018) were tested for statistical significance. All differences marked in bold in the data tables of this report are statistically significant at the 95% level. As samples from different TALIS cycles are considered independent, the standard error for any comparison between cycles is calculated with the expected value of the covariance being equal to 0.
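For two independent estimates, the test described above can be sketched as follows. The function name and the example figures are hypothetical, and the critical value of 1.96 assumes a normal approximation.

```python
import math


def independent_difference_test(est_a, se_a, est_b, se_b, critical_value=1.96):
    """Difference between two independent estimates, its SE and a 95% test."""
    difference = est_b - est_a
    # Independent samples: the covariance term is taken to be 0.
    se_difference = math.sqrt(se_a ** 2 + se_b ** 2)
    significant = abs(difference) > critical_value * se_difference
    return difference, se_difference, significant


# Hypothetical estimates for two cycles: (estimate, standard error).
change, se_change, is_significant = independent_difference_test(50.0, 3.0, 60.0, 4.0)
```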

## Statistics based on regressions

Regression analysis was conducted to explore the relationships between different variables. Multiple linear regression was used in those cases where the dependent (or outcome) variable was considered continuous. Binary logistic regression was employed when the dependent (or outcome) variable was a binary categorical variable. Regression analyses were carried out for each country separately. Similarly to other statistics presented in this report, the OECD and TALIS averages refer to the arithmetic mean of country level estimates, while the EU total is calculated as a weighted arithmetic mean based on the sum of final teacher (TCHWGT) or principal (SCHWGT) weights by country, depending on the target population.

Control variables included in a regression model are selected based on theoretical reasoning and, preferably, limited to the most objective measures or those that do not change over time. Controls for teacher characteristics include: teacher’s gender, age, employment status (i.e. full-time/part-time) and years of teaching experience. Controls for class characteristics include: variables of classroom composition (i.e. share of students whose first language is different from the language of instruction, low academic achievers, students with special needs, students with behavioural problems, students from socio-economically disadvantaged homes, academically gifted students, immigrant students or students with an immigrant background, refugee students) and class size.

In the case of regression models based on multiple linear regression, the explanatory power of the regression models is also highlighted by reporting the R-squared ($R^{2}$), which represents the proportion of the observed variation in the dependent (or outcome) variable that can be explained by the independent (or explanatory) variables.

In order to ensure the robustness of the regression models, independent variables were introduced into the models in steps. This approach also required that the models at each step be based on the same sample. The restricted sample used for the different versions of the same model corresponded to the sample of the most extended (i.e. with the maximum number of independent variables) version of the model. Thus, the restricted sample of each regression model excluded those observations where all independent variables had missing values.

### Multiple linear regression analysis

Multiple linear regression analysis provides insights into how the value of the continuous dependent (or outcome) variable changes when any one of the independent (or explanatory) variables varies while all other independent variables are held constant. In general, and with everything else held constant, a one-unit increase in the independent variable ($X_i$) changes, on average, the dependent variable ($Y$) by the units represented by the regression coefficient ($\beta_i$):

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_i X_i + \varepsilon$$

When interpreting multiple regression coefficients, it is important to keep in mind that each coefficient is influenced by the other independent variables in a regression model. The influence depends on the extent to which independent variables are correlated. Therefore, each regression coefficient does not capture the total effect of independent variables on dependent variables. Rather, each coefficient represents the additional effect of adding that variable to the model, if the effects of all other variables in the model are already accounted for. It is also important to note that, because cross-sectional survey data were used in these analyses, no causal conclusions can be drawn.

Regression coefficients in bold in the data tables presenting the results of regression analysis are statistically significantly different from 0 at the 95% confidence level.
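To make the role of the coefficients concrete, the sketch below fits a weighted least-squares regression by solving the normal equations in plain Python. It is an illustration under stated assumptions, not the estimation code used for this report: the function names and the toy data are hypothetical, and standard errors (which would require the replicate weights) are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def weighted_least_squares(x_rows, y, weights):
    """Return [b0, b1, ..., bk] minimising sum(w * residual**2)."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend intercept column
    k = len(design[0])
    xtwx = [
        [sum(w * r[i] * r[j] for r, w in zip(design, weights)) for j in range(k)]
        for i in range(k)
    ]
    xtwy = [
        sum(w * r[i] * yi for r, yi, w in zip(design, y, weights))
        for i in range(k)
    ]
    return solve_linear_system(xtwx, xtwy)


# Toy data generated exactly from Y = 1 + 2*X1 - 3*X2, with made-up weights.
x_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
weights = [1.0, 2.0, 1.0, 3.0, 1.0]
coefficients = weighted_least_squares(x_rows, y, weights)
```

Because the toy outcome is an exact linear function of the two predictors, the fitted coefficients recover the generating values: a one-unit increase in the first predictor raises the outcome by 2 units when the second is held constant.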

### Binary logistic regression analysis

Binary logistic regression analysis enables the estimation of the relationship between one or more independent (or explanatory) variables and the dependent (or outcome) variable with two categories. The regression coefficient ($\beta$) of a logistic regression is the estimated increase in the log odds of the outcome per unit increase in the value of the predictor variable.

More formally, let $Y$ be the binary outcome variable indicating no/yes with 0/1, and let $p$ be the probability that $Y$ equals 1, so that $p = \mathrm{prob}(Y = 1)$. Let $X_1, \dots, X_k$ be a set of explanatory variables. Then, the logistic regression of $Y$ on $X_1, \dots, X_k$ estimates parameter values for $\beta_0, \beta_1, \dots, \beta_k$ via the maximum likelihood method of the following equation:

$$\mathrm{logit}(p) = \log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$$

Additionally, the exponential function of the regression coefficient ($\exp(\beta)$) is obtained, which is the odds ratio (OR) associated with a one-unit increase in the explanatory variable. Then, in terms of probabilities, the equation above is translated into the following:

$$p = \frac{\exp(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}$$
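As an illustrative sketch of the relationship between log odds, probabilities and odds ratios, the snippet below converts a linear predictor into a probability and a coefficient into an odds ratio. The function names and the example coefficient are hypothetical.

```python
import math


def predicted_probability(intercept, coefficients, x_values):
    """Probability that Y = 1 for given predictor values."""
    eta = intercept + sum(b * x for b, x in zip(coefficients, x_values))
    # 1 / (1 + exp(-eta)) is algebraically equal to exp(eta) / (1 + exp(eta)).
    return 1.0 / (1.0 + math.exp(-eta))


def odds_ratio(coefficient):
    """Odds ratio associated with a one-unit increase in the predictor."""
    return math.exp(coefficient)


# Hypothetical model with one binary predictor and a coefficient of log(2).
beta = math.log(2.0)
p_reference = predicted_probability(0.0, [beta], [0])   # predictor = 0
p_exposed = predicted_probability(0.0, [beta], [1])     # predictor = 1
odds_reference = p_reference / (1.0 - p_reference)
odds_exposed = p_exposed / (1.0 - p_exposed)
```

Here the ratio of the two odds equals `odds_ratio(beta)`, which is 2, while the probabilities themselves are 0.5 and about 0.67.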

The transformation of log odds ($\beta$) into odds ratios ($\exp(\beta)$; OR) makes the data more interpretable in terms of probability. The odds ratio (OR) is a measure of the relative likelihood of a particular outcome across two groups. The odds ratio for observing the outcome when an antecedent is present is:

$$\mathrm{OR} = \frac{p_{11}/p_{12}}{p_{21}/p_{22}}$$

where $p_{11}/p_{12}$ represents the “odds” of observing the outcome when the antecedent is present, and $p_{21}/p_{22}$ represents the “odds” of observing the outcome when the antecedent is not present. Thus, an odds ratio indicates the degree to which an explanatory variable is associated with a categorical outcome variable with two categories (e.g. yes/no) or more than two categories. An odds ratio below one denotes a negative association; an odds ratio above one indicates a positive association; and an odds ratio of one means that there is no association. For instance, if the association between being a female teacher and having chosen teaching as a first choice as a career is being analysed, the following odds ratios would be interpreted as:

- **0.2:** Female teachers are five times less likely to have chosen teaching as a first choice as a career than male teachers.
- **0.5:** Female teachers are half as likely to have chosen teaching as a first choice as a career as male teachers.
- **0.9:** Female teachers are 10% less likely to have chosen teaching as a first choice as a career than male teachers.
- **1:** Female and male teachers are equally likely to have chosen teaching as a first choice as a career.
- **1.1:** Female teachers are 10% more likely to have chosen teaching as a first choice as a career than male teachers.
- **2:** Female teachers are twice as likely to have chosen teaching as a first choice as a career as male teachers.
- **5:** Female teachers are five times more likely to have chosen teaching as a first choice as a career than male teachers.
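The odds ratio for two groups can be computed directly from the four probabilities (or counts) of a two-by-two table, as in the sketch below. The figures are invented for illustration, and the function names are hypothetical; the ratio of probabilities is included only to show that it is a different quantity from the odds ratio.

```python
def odds_ratio_from_table(p11, p12, p21, p22):
    """(p11 / p12) / (p21 / p22): odds with the antecedent over odds without."""
    return (p11 / p12) / (p21 / p22)


def probability_ratio(p11, p12, p21, p22):
    """Ratio of outcome probabilities in the two groups (not the odds ratio)."""
    return (p11 / (p11 + p12)) / (p21 / (p21 + p22))


# Hypothetical counts: 80 of 100 female teachers and 60 of 100 male teachers
# report teaching as a first-choice career.
or_value = odds_ratio_from_table(80, 20, 60, 40)
pr_value = probability_ratio(80, 20, 60, 40)
```

In this toy example the odds ratio is about 2.67 while the ratio of probabilities is about 1.33, so the odds ratio describes a ratio of odds and should be read with that in mind.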

The odds ratios in bold indicate that the relative risk/odds ratio is statistically significantly different from 1 at the 95% confidence level. To compute statistical significance around the value of 1 (the null hypothesis), the relative-risk/odds-ratio statistic is assumed to follow a log-normal distribution, rather than a normal distribution, under the null hypothesis.
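One common way to operationalise a test on the log scale is to build the confidence interval around the coefficient and then exponentiate its bounds. The sketch below follows that approach; it is an assumption about the mechanics rather than the report's actual code, and the function names and example values are hypothetical.

```python
import math


def odds_ratio_interval(beta, se_beta, critical_value=1.96):
    """Odds ratio with a 95% interval built on the log-odds scale."""
    lower = math.exp(beta - critical_value * se_beta)
    upper = math.exp(beta + critical_value * se_beta)
    return math.exp(beta), lower, upper


def differs_from_one(beta, se_beta, critical_value=1.96):
    """True when the interval for the odds ratio excludes 1."""
    _, lower, upper = odds_ratio_interval(beta, se_beta, critical_value)
    return lower > 1.0 or upper < 1.0
```

The resulting interval is asymmetric around the odds ratio, which reflects the log-normal assumption: it is symmetric only on the log scale.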

## Pearson correlation coefficient

A correlation coefficient measures the strength and direction of the statistical association between two variables. Correlation coefficients vary between -1 and 1; values around 0 indicate a weak association, while the extreme values indicate the strongest possible negative or positive association. The Pearson correlation coefficient (indicated by the letter *r*) measures the strength and direction of the linear relationship between two variables.

In this report, Pearson correlation coefficients are used to quantify relationships between country-level statistics.
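A minimal sketch of the Pearson correlation coefficient for two lists of country-level statistics is shown below. The function name is hypothetical, and the computation is the textbook formula rather than code taken from the report.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariation / math.sqrt(variation_x * variation_y)
```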

## Changes between TALIS cycles and implications for analyses

### Change in the definition of the target population between TALIS cycles

The third TALIS cycle (i.e. TALIS 2018) allows analysis of changes over a 10-year period. Nevertheless, such analysis poses particular challenges and, therefore, requires caution. The challenges include the following: country coverage and the target population within a given country may differ across cycles; the variables of interest may change because of changes in the questionnaires; and the context of teaching and learning may also change. Therefore, comparisons across cycles need to be interpreted with care.

In TALIS 2008, teachers whose teaching is directed entirely or mainly to students with special needs were not part of the target population. However, this changed for TALIS 2013 and 2018, as teachers of special needs students were included in the target population. Hence, estimates representing the change from 2008 to 2013 and from 2008 to 2018 need to be interpreted with caution. Nevertheless, it is important to note that teachers who work in schools that teach only special needs students were excluded from all TALIS cycles.

In the case of New Zealand, the definition of the target population changed between TALIS 2013 and TALIS 2018. While schools with four or fewer eligible teachers were excluded in 2013, this was no longer the case in 2018. As a result, a filter variable (TALIS13POP), which excludes schools with four or fewer teachers for New Zealand, was used to estimate 2018 statistics for New Zealand in order to ensure comparability in data tables representing changes over time. Therefore, these results can differ from those reported for the full TALIS 2018 sample of New Zealand, especially for those based on principals’ reports.

### Change in the ISCED classification

The classification of levels of education is based on the International Standard Classification of Education (ISCED). ISCED is an instrument for compiling statistics on education internationally. In TALIS 2008 and 2013, ISCED-97 was used to report on teachers’ and principals’ educational attainment. The first classification, ISCED-97, was revised and the new one, ISCED-2011, was formally adopted in November 2011. ISCED-2011 is the basis of the education levels presented in the TALIS 2018 questionnaires for teachers and for school principals. The data tables reporting teachers’ and principals’ educational attainment in this report are based on ISCED-2011. A correspondence table (Table AI.B.2) was used to translate ISCED-97 education categories used in TALIS 2008 and 2013 into the categories of the new ISCED-2011, in order to produce tables reporting changes in teachers’ and principals’ educational attainment from 2008 to 2018. This correspondence table was used to compile Tables I.4.11 and I.4.27 in Chapter 4 of Volume I. However, changes over time in teachers’ and principals’ educational attainment will need to be interpreted with caution because of the change in the classification.

For certain countries, the correspondence between ISCED-97 and ISCED-2011 was revised to reflect country specificities, compared to the general approach presented in Table AI.B.2. As a result, for Tables I.4.11 and I.4.27, ISCED-97 level 5B was reclassified as ISCED-2011 level 6 in the cases of Italy and the Flemish Community of Belgium.

In Austria, the former “Pädagogische Akademie” (pedagogical academy, ISCED-97 level 5B) was transformed into “Pädagogische Hochschule” (university college of teacher education, ISCED-2011 level 6) in 2007. Thus, in the case of Austria, the large change from 2008 to 2018 in ISCED levels 5 and 6 in Tables I.4.11 and I.4.27 is caused not only by the change in ISCED classification but also by the change in the system of teacher education.

In Portugal, the teachers with a “pre-Bologna master’s degree” are categorised as ISCED level 6. The question is presented in a way that prevents the disaggregation between “pre-Bologna master’s degree” and “doctorate degree”.

In Slovenia, teachers with a “pre-Bologna bachelor’s degree” are categorised as ISCED level 5 (which typically corresponds to short-term tertiary education). The question is presented in a way that prevents the disaggregation between “pre-Bologna bachelor’s degree” and “bachelor’s degree”.

## References

[1] **OECD** (2019), *TALIS 2018 Technical Report*, OECD, Paris.

[2] **UNESCO-UIS** (2012), *International Standard Classification of Education: ISCED 2011*, UNESCO Institute for Statistics, Montreal, http://uis.unesco.org/sites/default/files/documents/international-standard-classification-of-education-isced-2011-en.pdf.