Annex B. Composite indices of innovation
The analyses reported throughout this book have shown considerable variation in the amount of change in educational practices, and thus in the potential extent of innovation. To provide an overview of change across school and classroom practices and to draw some conclusions about the level of innovation in each country, it is helpful to combine some of this information and examine the extent and focus of innovation within education in different countries.
There may be important differences between practices at different education levels (primary or secondary) or across disciplines. For this reason, broader composite indices have been created to group together practices and represent innovation at the discipline level (maths, science and reading) and at the education level (primary and secondary), as well as an index for overall educational innovation. Additionally, composite indices for ICT practices and more specific educational practices have been computed. This allows readers and policy makers to identify which aspects of countries’ education system(s) appear to have experienced relatively more innovation, and to identify countries that are innovating throughout the education system.
Creating the indices
The indices draw from the analysis reported in this book. The approach used is broadly based on the guidance provided in the 2008 OECD handbook on constructing composite indicators. In particular, the indices are derived (as far as possible) from the definition of innovation discussed in the introduction, and the process of creating them takes into account the need for appropriate data and the imputation of missing values.
The indices are based on the effect sizes of changes in responses to specific questions between the baseline and endline years. Effect sizes reflect the size and direction of the change between two points in time, with a large positive effect size indicating a large increase over time and a large negative effect size indicating a large decrease. Effect sizes give a standardised measure of change and can thus be readily combined.
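As an illustration, the sketch below computes a Cohen's-d-style effect size, i.e. the change in means between the two survey waves divided by their pooled standard deviation; the exact effect-size formula used for each question in this book may differ, so the function and data below are assumptions made for illustration only.

```python
import math

def effect_size(baseline, endline):
    """Cohen's-d-style effect size: the change in means between two survey
    waves divided by their pooled standard deviation (illustrative only)."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    pooled_sd = math.sqrt(
        ((len(baseline) - 1) * var(baseline) + (len(endline) - 1) * var(endline))
        / (len(baseline) + len(endline) - 2)
    )
    return (mean(endline) - mean(baseline)) / pooled_sd

# A positive value indicates that the practice became more frequent between the
# baseline and endline years; a negative value indicates it became less frequent.
print(effect_size([3.1, 2.8, 3.4, 2.9], [3.6, 3.3, 3.8, 3.5]))
```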
Education level, discipline level, and overall indices of innovation
These indices are constructed in order to represent change in practices across different grades, disciplines or throughout the whole education system. Given that both increases and decreases indicate change which can be part of innovation, the absolute value of the effect size has been used to create these indicators. An index that kept the sign of the effect size would make countries that have large changes in both directions appear to have no change at all.
In order to have a fair representation of innovation, different disciplines have been given different weights at different levels. Primary and secondary levels were given equal weights, whereas maths, science and reading were given different weights defined on the basis of the relative instruction time spent on each discipline in the respective grades (source: Education at a Glance 2011). For instance, as reading instruction time is roughly twice as large as science instruction time in primary education, change in reading practices was given twice as much weight as change in science practices at this level.
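The sketch below illustrates how such a weighted average of absolute effect sizes could be computed and scaled by 100, as described later in this annex; the discipline weights and effect sizes shown are placeholders rather than the published instruction-time shares or actual results.

```python
# Illustrative weights only: in practice these would be derived from relative
# instruction time per discipline (Education at a Glance 2011).
PRIMARY_WEIGHTS = {"reading": 0.50, "maths": 0.25, "science": 0.25}

def level_index(effect_sizes, weights):
    """Weighted average of absolute effect sizes for one education level,
    scaled by 100 as in the reported indices."""
    total, total_weight = 0.0, 0.0
    for discipline, sizes in effect_sizes.items():
        mean_abs = sum(abs(e) for e in sizes) / len(sizes)
        total += weights[discipline] * mean_abs
        total_weight += weights[discipline]
    return 100 * total / total_weight

# Hypothetical effect sizes for a single education system.
primary_index = level_index(
    {"reading": [0.35, -0.10], "maths": [0.20], "science": [0.05, 0.15]},
    PRIMARY_WEIGHTS,
)
```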
ICT and thematic indices
These indices illustrate change in more specific educational practices. In this case, however, it is relevant to analyse not only whether the use of certain practices has changed significantly, but also whether that use has more often increased or decreased. Thus, in addition to the value of the composite indices based on absolute effect sizes, the graphs for ICT and thematic practices also show the decomposition of the change into increases and decreases.
The conceptual grouping of these indicators was done so as to maintain a roughly balanced representation of practices across grades and disciplines. This made it possible to use an unweighted average rather than weighting by grade or discipline.
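The following sketch illustrates how an unweighted ICT or thematic index could be decomposed into its positive (increases) and negative (decreases) contributions; the function name and data layout are illustrative assumptions.

```python
def thematic_index(effect_sizes):
    """Unweighted composite for a set of ICT or thematic practices, together
    with its decomposition into increases and decreases (all scaled by 100)."""
    n = len(effect_sizes)
    increases = 100 * sum(e for e in effect_sizes if e > 0) / n
    decreases = 100 * sum(-e for e in effect_sizes if e < 0) / n
    index = 100 * sum(abs(e) for e in effect_sizes) / n  # = increases + decreases
    return index, increases, decreases

# Hypothetical effect sizes for the questions grouped under one theme.
index, up, down = thematic_index([0.25, -0.15, 0.05, -0.30])
```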
Missing values
Variation in the coverage of PISA and TIMSS/PIRLS means that school and classroom change effect sizes are not available for all education systems across all of the questions asked. Furthermore, data are missing when certain questions (or questionnaires) were omitted at the national level at certain points in time. This is not an issue when reporting responses to a single question, but it does pose a potential problem when seeking to combine information across questions. In order to analyse as many countries as possible whilst keeping a wide range of questions in the analysis, it has been necessary to manage the missing data through a combination of deletion and estimation processes.
An iterative process has been used to manage observations (education systems) and variables (questions) with missing data, and some systems/countries and questions have had to be omitted in the construction of an index:
1. Education systems that had effect size data for fewer than 20% of the potential question set were excluded.
2. Following this, questions with high proportions of missing data were dropped. Specifically, questions with effect sizes missing for more than 50% of the remaining database were excluded.
3. Education systems with less than 60% valid data on the remaining questions were then excluded from the analysis.
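These three filtering steps could be implemented along the following lines; the data structure (a mapping from each education system to its available question effect sizes) is an assumption made for illustration.

```python
def filter_missing(effect_sizes):
    """Apply the three deletion steps described above.
    effect_sizes maps each education system to a dict of question -> effect size;
    questions without data are simply absent from that dict."""
    questions = sorted({q for row in effect_sizes.values() for q in row})

    # 1. Exclude systems with data for fewer than 20% of the potential question set.
    systems = {s: row for s, row in effect_sizes.items()
               if len(row) >= 0.20 * len(questions)}

    # 2. Exclude questions missing for more than 50% of the remaining systems.
    kept = [q for q in questions
            if sum(q in row for row in systems.values()) >= 0.50 * len(systems)]

    # 3. Exclude systems with less than 60% valid data on the remaining questions.
    systems = {s: {q: row[q] for q in kept if q in row}
               for s, row in systems.items()
               if sum(q in row for q in kept) >= 0.60 * len(kept)}
    return systems, kept
```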
Following the deletion process, some of the remaining education systems still had portions of missing data. Data were typically missing when an education system had not participated in one of the surveys. As the information for a whole dataset was missing, it was not possible to undertake an imputation at the indicator level. However, it was possible to estimate the effect of a missing dataset on the final index.
The estimation process uses information from countries with all data points available to estimate the impact of including a given dataset in the index computation. This information is then used to adjust the indices of countries missing one dataset. The process is as follows:
- For education systems with all the information available, a set of sub-indices was computed, each one excluding one of the datasets from the index computation (I_{-A}, where A denotes the excluded dataset). The index including all the data (I) was also calculated. For instance, if other countries were missing PISA, countries with all the information available would have an index excluding PISA (I_{-PISA}) and one including PISA (I).
- The ratio of the complete index to each sub-index was calculated for each country (r_A = I / I_{-A}).
- The cross-country mean of the ratio of the full index to every sub-index was then computed, giving a dataset factor for each potential missing data source (R_A = mean of r_A across countries).
- Finally, countries missing data from one source (A) had their index computed with all the information available to them (I_{-A}). This index was then corrected by multiplying it by the dataset factor of the corresponding missing database, giving the final composite index (I = R_A × I_{-A}).
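A possible implementation of this correction is sketched below; the data structures and variable names mirror the notation above but are otherwise illustrative assumptions.

```python
def dataset_factors(full_index, sub_indices):
    """Estimate a correction factor for each dataset that may be missing.
    full_index[c] is the complete index I for country c (all data available);
    sub_indices[c][A] is that country's index computed without dataset A (I_{-A})."""
    datasets = {A for subs in sub_indices.values() for A in subs}
    factors = {}
    for A in datasets:
        ratios = [full_index[c] / sub_indices[c][A]
                  for c in full_index if A in sub_indices.get(c, {})]
        factors[A] = sum(ratios) / len(ratios)  # cross-country mean of I / I_{-A}
    return factors

def adjusted_index(partial_index, missing_dataset, factors):
    """Final composite index for a country missing one dataset: its partial
    index I_{-A} multiplied by the corresponding dataset factor R_A."""
    return factors[missing_dataset] * partial_index
```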
Criteria for including questions in the indices
Highly correlated questions may unduly influence an index that seeks to explore the extent to which change occurs across different aspects of education, particularly given the existence of missing data. For this reason, where question effect sizes are highly correlated (0.6 or more using Pearson’s r) and the wording of the questions is the same across different grades or subjects, only the question with the highest absolute effect size at the OECD level has been included in the classroom, school and overall indices. Where the effect sizes of different questions within a module are correlated but the wording differs, both questions have been included as separate items within the indicator. Questions have also been retained in the subject- and grade-level indices, where such correlation is not a problem.
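This pruning rule could be sketched as follows, assuming that each question's effect sizes are available as equally ordered lists across the retained education systems; apart from the 0.6 cut-off, the names and data layout are illustrative assumptions.

```python
from itertools import combinations
from statistics import correlation  # Pearson's r (Python 3.10+)

def prune_correlated(effect_sizes, wording, oecd_abs_effect, threshold=0.6):
    """effect_sizes[q]: effect sizes of question q across education systems
    (same order for every question); wording[q]: the question text;
    oecd_abs_effect[q]: absolute effect size of q at the OECD level."""
    dropped = set()
    for q1, q2 in combinations(effect_sizes, 2):
        if q1 in dropped or q2 in dropped:
            continue
        same_wording = wording[q1] == wording[q2]
        if same_wording and correlation(effect_sizes[q1], effect_sizes[q2]) >= threshold:
            # Keep only the question with the larger absolute OECD-level effect size.
            dropped.add(q1 if oecd_abs_effect[q1] < oecd_abs_effect[q2] else q2)
    return [q for q in effect_sizes if q not in dropped]
```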
Developing and reporting the indices
The indices developed are intended to show the extent of change or innovation in one country when compared with other countries. They can be used to rank countries according to their relative levels of innovation across levels, disciplines and in more specific educational practices.
Discipline, education level and overall innovation index for each country = 100 × (weighted average of absolute effect sizes)
The ICT and thematic innovation indices do not weight their component questions; the composite index for each country = 100 × (unweighted average of absolute effect sizes)
The number of questions included depends on whether data exist in PISA and/or TIMSS/PIRLS and therefore differs across education systems. It also depends on the indicator itself: for example, up to 33 questions are used in the reading innovation index compared to 49 in the science index. The number of questions included across the ICT and thematic indices also varies considerably.
Absolute effect sizes can take values greater than one, although in practice they mostly range between 0 and 1. The indices can therefore in principle take values from 0 to positive infinity, but in practice the broad composite indices never exceed 100. The ICT and thematic composite indices have the same range as the broader ones, but their decomposition also shows the negative and positive contributions.
Cautions
Question inclusion
The indices combine information from a large and diverse pool of questions asked in different surveys. On the assumption that each question can provide additional information about the extent of change and innovation in an education system, the process employed to develop the indices has drawn on as many of the questions as possible, and their inclusion has been determined by the availability of valid data. However, a more theoretical approach focusing on the most relevant questions, or a statistical approach to data reduction, may yield different results.
Education system coverage
The indices provide some information about a subset of the education systems discussed in the previous chapters. This subset has been determined by the availability of data, and it may be that other systems would sit at the extremes of the ranking. It should also be noted that the inclusion or removal of education systems would affect the estimation of missing values. Although the ranking gives a robust synthesis of the change covered by our change indicators, it should not be over-interpreted.
OECD average
The OECD average is computed for all the education systems for which data are available for all the years concerned. For regions that do not correspond to an entire OECD member country, the following procedure has been applied: education systems that are part of a country for which overall data are available are not considered (as is the case for the different states of the United States); conversely, education systems belonging to a country for which no whole-country figure exists have been given a weight equal to 1 (as is the case, for example, of Ontario and Flanders (Belgium), among others).
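As an illustration of this weighting rule, the sketch below drops subnational systems whose country is already covered as a whole and gives every remaining system a weight of 1; the data layout and the example systems mentioned are assumptions.

```python
def oecd_average(indices, parent_country, countries_with_data):
    """indices: index value per education system; parent_country: the OECD country
    a subnational system belongs to (None for whole-country systems);
    countries_with_data: countries for which a whole-country figure is available."""
    retained = []
    for system, value in indices.items():
        parent = parent_country.get(system)
        if parent is not None and parent in countries_with_data:
            continue  # e.g. US states when a United States figure exists
        retained.append(value)  # weight of 1, e.g. Ontario or Flanders (Belgium)
    return sum(retained) / len(retained)
```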
Time periods
The effect size of the change in responses to a particular question is typically calculated across the same two points of time for each country but the two points in time may differ by question. The indices therefore show a tendency to change or innovate across slightly different time periods, rather than the extent of change over a specific time period.
Interpreting the findings
The indices reported help the reader to consider the benefits of a composite innovation indicator based on change measures, but they may not provide a fully accurate representation of the level of change and innovation within a country. Whilst the indicator is based on many questions and observations, the missing data imputation and correction needed to construct the innovation indices invite the reader to be cautious. The innovation indices are merely indicators of innovation, and small differences in levels are almost certainly not meaningful.
A higher score on the indicator suggests that an education system is characterised by more change than other systems. However, there is currently no theory that could be applied to describe the different levels in terms of adequacy of innovation. Similarly, the scale does not provide information about what is necessary to move from one point to another. Additional work could be undertaken to develop qualitative descriptions of different points on the scale, but this should be preceded by improved data collection.
Component indicators of the ICT-based and thematic composite indices