Chapter 3. Making sense of early tracking in the Netherlands

The Dutch school system is highly stratified with extensive early tracking. Early tracking is controversial, but student outcomes in the Netherlands are good on average and in terms of equity. However, the integrity of the tracking system is increasingly challenged, with evidence pointing to large student performance differences within educational tracks (programmes), and seemingly growing inequity in educational opportunities between disadvantaged and more advantaged students. This chapter analyses the challenges of the system for initial selection and allocation of students into different tracks and proposes options for improvement. It highlights the importance of a national and objective test to determine the initial tracking decision. It also examines ways of improving the permeability of the system through a drastic reduction of down-tracking and grade repetition, and strong differentiated teaching skills to identify strong performers within classrooms and support their potential promotion to a higher track.


The pros and cons of early tracking

Secondary education can be comprehensive, or, as in the Netherlands, involve multiple separate tracks

In comprehensive education systems, children of different ability levels attend the same school and follow the same educational programmes for a long time. Schools and teachers cater to a wide range of student abilities, and ability grouping is typically within the same school or even the same class, which allows ready transition between difficulty levels. Students can often follow different subjects at different difficulty levels. In stratified systems, however, children are separated (sometimes as early as lower secondary level) into different educational programmes or “tracks” according to their abilities. These education systems are typically found in German-speaking countries, Eastern Europe, the Flemish Community of Belgium and the Netherlands and can be more or less stratified, depending on the age at selection and/or the number of programmes they offer to students (Bol et al., 2014; Education Council, 2010; Prokic-Breuer and Dronkers, 2012).

The merits of “early” tracking (after primary school) have been extensively debated

Proponents of early tracking argue that grouping students by ability makes it easier for teachers to target the right level, which leads to more efficient learning. They argue that if the ability distribution in the classroom is too wide and teachers target the average, both the strongest and the weakest performers will suffer (Hallinan, 1994; Lazear, 2001; Sund, 2013). Critics of early tracking point to the risks for low-achievers. A rich literature documents peer effects: a student’s performance is improved by more able classmates and reduced by less able ones. Tracked systems tend to deprive low-performing students of the positive peer effects1 from stronger students. In addition, students in vocational tracks are often subject to a very different curriculum that sets them on a learning trajectory from which it is subsequently hard to escape (Korthals, 2015; Sund, 2013). Finally, since students develop at different paces, early selection can easily result in the misallocation of students, which is a particular problem if initial misallocation is hard to rectify.

Evidence from cross-country studies on the overall effects is uncertain

The discussion around tracking leads to the question of who gains and who loses and by how much. Among cross-country studies, Hanushek and Woessmann (2006) find that early tracking increases inequality and has no clear effect on average achievement. Other studies have obtained different results. Waldinger (2006) finds that tracking has no effect on the relationship between family background and achievement and concludes that there is no evidence of it having a negative impact on equity. Jakubowski and Pokropek (2015) compare achievement progress between primary and secondary education to find that progress is lower in early tracking countries, with more negative outcomes for boys and low-achieving students. Brunello and Checchi (2007) analyse International Adult Literacy Survey (IALS) data (another popular source for international comparisons) to yield ambiguous findings that suggest different effects on literacy and future earnings.

Studies based on variation within countries produce similarly mixed results

Figlio and Page (2002) use US data to investigate the effects of tracking on students’ mathematics test scores and find that tracking does not harm low-ability students or benefit high-ability students. Looking at the comprehensive school reform2 in Finland, Pekkarinen, Uusitalo and Kerr (2009) find a small positive effect of the reform on the verbal test scores of students from poorly educated families, but no effect on performance in arithmetic or logical reasoning. In the Dutch context, Van der Steeg, Vermeer and Lanser (2011) find that tracking has positive effects on the performance of students at the top of the performance distribution, but no negative effect for average students. Van Elk, Van der Steeg and Webbink (2011) suggest that increasing participation in comprehensive classes (combined general secondary education (HAVO) and pre-university education (VWO) classes) would increase graduation from higher education.

Ability grouping is not the only difference between selective and comprehensive school systems

The institutional context of tracking, in terms of school choice, teacher selection, curricular arrangements, funding arrangements and so on, varies among countries. This may explain some of the ambiguity of the research evidence and suggests that the effects of tracking may differ from country to country (Pekkarinen, Uusitalo and Kerr, 2009).

Despite early tracking, student outcomes in the Netherlands are good on average and in respect of equity

It may be expected that the peer effects of early tracking damage the performance of students at the lower end of the performance scale and improve the performance of the best students, thus stretching the performance distribution at both the top and bottom. However, the Dutch results starkly contradict this hypothesis, with strong results at the bottom end of the distribution and slightly disappointing results at the top end (see Chapter 4). For example, the Netherlands had one of the smallest percentages of low-performing students3 in mathematics (15%) in the Programme for International Student Assessment (PISA) 2012 (OECD average of 23%). Moreover, when compared to other top-performing countries, the average score of students at the bottom of the performance distribution is relatively high. Similarly, PISA 2012 showed that the performance gap between students with an immigrant background and native students is smaller than in countries with a similar size and nature of migrant population (such as Germany, Austria or Sweden). So one major argument held by critics, that early tracking damages equity, is difficult to sustain in the case of the Netherlands.

School selection and its link to tracking

Large performance differences within tracks, and performance overlaps across tracks, are a problem

Figure 3.1 presents an analysis of student performance across different educational tracks based on PISA 2012.4 The figure shows that there is an extremely large variation in performance within any given track. It also shows that there is a lot of overlap in literacy and numeracy performance between students in one track and those in another. This suggests that, in any one track, a very large group of students in the Netherlands have the same cognitive skills as students in the “next” track, despite having been placed in different tracks. For example, many of the best HAVO students are performing as well as the weaker VWO students.
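One simple way to quantify such overlap is to ask what share of students in a lower track score at or above the first quartile of the adjacent higher track. The sketch below is illustrative only: the function name `share_above_quartile` and the scores are invented for the example, and it is not the method used for the OECD calculations.

```python
from statistics import quantiles


def share_above_quartile(lower_track, higher_track):
    """Share of lower-track students scoring at or above the
    first quartile of the higher track (a rough overlap measure)."""
    # First quartile of the higher track, with linear interpolation.
    q1 = quantiles(higher_track, n=4, method="inclusive")[0]
    return sum(score >= q1 for score in lower_track) / len(lower_track)


# Hypothetical scores, NOT actual PISA data.
havo_scores = [480, 500, 520, 540, 560, 580]
vwo_scores = [540, 560, 580, 600, 620]

overlap = share_above_quartile(havo_scores, vwo_scores)
print(f"Share of HAVO students at or above the VWO first quartile: {overlap:.0%}")
```

A high share would indicate that many students in the lower track perform at least as well as the weaker quarter of the higher track, which is the pattern described above.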

There is considerable school segregation within educational tracks

Ensuring consistently high standards across schools is a formidable challenge for any school system. PISA 2012 suggests that a considerable percentage of the total variation in student performance within tracks, about 20% on average, can be attributed to differences in performance5 between schools. These between-school differences are largest in the pre-vocational education (VMBO-g/t) tracks, at 26%. Other top-performing countries with comprehensive systems, such as Canada, Finland and Poland, have similar results, with one striking difference: their results are at the system level while the Dutch results are at the track level. In other words, in these countries student performance depends much less on the school a student attends than in the Netherlands.
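The idea of a between-school share of variation can be illustrated with a simple descriptive decomposition: the sum of squared deviations of school means from the overall mean (weighted by school size), divided by the total sum of squared deviations. The sketch below is an assumption-laden illustration, not the OECD procedure, which may rely on multilevel models, sampling weights and plausible values. The function name and the scores are hypothetical.

```python
def between_school_share(scores_by_school):
    """Fraction of total score variation that lies between schools.

    scores_by_school maps a school identifier to a list of student scores.
    """
    all_scores = [s for scores in scores_by_school.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total variation: every student's squared distance from the overall mean.
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)

    # Between-school variation: squared distance of each school mean from
    # the overall mean, weighted by the number of students in the school.
    between_ss = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in scores_by_school.values()
    )
    return between_ss / total_ss


# Hypothetical scores for two schools offering the same track.
example = {"school_a": [490, 510], "school_b": [500, 520]}
share = between_school_share(example)
print(f"Between-school share of variation: {share:.0%}")
```

In this toy example, one fifth of the variation lies between the two schools and the remainder lies between students within the same school.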

Figure 3.1. The cognitive skills of students in different educational tracks, PISA 2012
PISA mathematics score distribution, by educational track
PISA reading score distribution, by educational track

Note: The markers around the middle point of each graph indicate standard deviations.

Source: OECD calculations based on the OECD PISA 2012 database.

The problem of inconsistent selection criteria

Inconsistent selection undermines the rationale of tracking

The rationale for tracking assumes that students with a certain level of cognitive skills will be best served in an educational programme that sufficiently motivates and challenges them in their learning. However, as suggested by the OECD’s analysis, a considerable proportion of students are finding themselves in educational programmes that do not necessarily match their cognitive skills. The process of selection into tracks is a relevant factor for two main reasons:

  1. The results of the end of primary test6 have never been consistently linked to primary school recommendations; and primary schools can differ radically in how they use these test scores to advise on track placement in secondary education. With the same test scores, students could easily obtain recommendations for higher tracks at some schools (or in some regions), while other schools could be more restrictive in their advice (Education Council, 2014; Van der Werfhorst, 2014). Overall there has been a tendency, especially in big cities under the pressure of ambitious parents, to “inflate” the end of primary test results, which leads to an increased percentage of students being advised to take the higher track.

  2. Secondary schools in the Netherlands are free to select students and impose additional selection requirements that may go beyond the primary school’s advice. For example, some elite independent gymnasia only accept students with exceptionally high CITO7 scores. Secondary schools may be more selective and make extensive use of their autonomy as they are under pressure to speed up educational trajectories and improve their results. This is typically the case with popular secondary schools or in parts of the country where population pressures mean that demand for places exceeds supply. Schools with declining student rolls may feel inclined to be more lenient when it comes to student test scores. Students who have the same track recommendations but live in different parts of the country may therefore be more or less successful in obtaining entrance to the track of their choice.

The recent reform that places more emphasis on teacher assessment will not improve the consistency of selection

Since 2014/15, the national end of primary test has been de-emphasised as the main instrument for determining the educational track of students in favour of teacher assessments of students’ cognitive skills. This reform was based on the observation that end of primary test scores were used too restrictively. For example, students with weak test results, who were given the benefit of the doubt in terms of track placement, often managed to do well and obtain their qualification at the higher educational level (Education Council, 2014).

But reliance on teacher assessments risks both bias and inconsistency

Despite the good intentions behind this reform, it is fraught with risk. Although primary school teachers know their students, and are, in principle, capable of assessing them multi-dimensionally and through time, there are several reasons why this shift in emphasis from a national test to a teacher assessment could increase inconsistency:

  1. Even if teachers know their students, they are not in a position to compare their own students with a national sample, so they will not normally know if they are making higher or lower recommendations than teachers across the Netherlands.

  2. Teacher judgement is often biased in favour of children from advantaged backgrounds (Waldinger, 2006; for Netherlands-specific findings, see Timmermans, Kuyper and van der Werf, 2015). Higher track recommendations than would be expected, given the end of primary test score, are mainly obtained by children with higher socio-economic backgrounds in the Netherlands (Education Council, 2014).

  3. Teacher assessments can be biased by pressure from articulate parents willing and able to argue the case of their child (Hillmert and Jacob, 2010; van der Werfhorst and Hofstede, 2007). This adds to the risk of bias against those from disadvantaged backgrounds. The end of primary test now takes place later in the year, i.e. after the decision of a student’s placement in secondary school has been made. Although primary schools may adjust their advice when test results are higher than the initial advice, this rarely happens. Low-educated parents are also found to rarely object to low school advice (Korpershoek et al., 2016).

Growing inequity in track placement

The most recent report of the Inspectorate of Education (2016) looks at track selection and student placements. The results clearly point to growing inequity in track placement. With the same results on national tests at the end of primary education, children of lower socio-economic background are increasingly likely to be placed in lower tracks than their more advantaged peers.

Permeability between tracks after selection

Alongside effective initial selection, tracking requires subsequent permeability between educational tracks

Students develop their skills at different paces, so any assessment administered at one point in time will not accurately predict later performance. If initial tracking decisions are subject to error, for the reasons discussed above, this adds to the importance of subsequent permeability. This means that any stratified education system needs effective mechanisms to allow initial tracking decisions to be adjusted in response to performance (Checchi and Flabbi, 2007; Jakubowski and Pokropek, 2015). In the Netherlands, students with the potential to switch to higher tracks face two challenges: 1) reduced expectations and opportunities to learn in lower tracks may mean that their potential cannot be developed and realised; and 2) there may be direct obstacles to transition (discussed in the next section).

National learning goals and central examinations determine programmes in each track

In the absence of a national curriculum, the national learning goals prescribed by the Ministry of Education, Culture and Science (MoECS), together with central examinations at the end of secondary education, guide Dutch teachers in setting learning goals for their students. Because of stratification, learning goals are set differently for every secondary school track, translating into what is taught and examined in each track. With many different educational tracks, permeability between tracks depends on how well learning goals, allied to sensitive teaching practice, align between the tracks and thereby support promotion to higher tracks. The degree of alignment is vital, given the evidence of overlap in the cognitive skills of students between tracks. Keeping students in schools where expectations and the curriculum taught are below their potential level means that students’ talents are not fully exploited.

Promotion to higher tracks faces increasing practical obstacles

Students seeking promotion to a higher track often need additional support to catch up on material that was not studied in the lower track. How to provide this additional support is left to the discretion of schools and is therefore applied variably. Some HAVO schools introduce additional selection criteria; others oblige students to follow “reparatory” classes to catch up with the learning goals of HAVO. In many schools, student motivation will be a key factor in determining success as students are not always offered support. The reduction in the number of larger schools, with VMBO, HAVO and VWO within one school, also plays a role. For years the Netherlands has witnessed a nationwide trend of creating separate schools by school type (e.g. a VMBO school). This development provides an additional obstacle to students seeking promotion to higher tracks (Inspectorate of Education, 2016). As a consequence of these obstacles in the current system, there are only a small number of students that stream upwards from vocational to general tracks, and of those that do, a considerable proportion fail to achieve a diploma at that level. For example, the data show that among the VMBO students that go on to HAVO, about 25% do not obtain a HAVO diploma (Inspectorate of Education, 2015).

Strong differentiated teaching skills are needed to support permeability

The capacity of teachers to assess individual students, develop potential, and promote able students to higher tracks is a key element in the permeability of the system (Jakubowski and Pokropek, 2015). This means that even in a highly tracked system, teachers need strong differentiated teaching skills: schools can never afford to assume that initial tracking has ensured a homogeneous classroom. As students develop variably, they may excel in some subjects but not others. Many teachers lack the skills to systematically assess students and differentiate their teaching to individual learning needs (see Chapter 5). Weak teaching practice that offers no timely and practical response to struggling students, except grade repetition or down-tracking, leads to many low-performing students repeating or being down-tracked. Similarly, the (potentially) best-performing students in the class may not be sufficiently challenged in their learning to follow subjects at a higher level or even to seek promotion to a higher track. So underdeveloped differentiated teaching skills add to the downward pressure on the mobility of students, with down-tracking too easy an option and track promotion too difficult.

Grade repetition and down-tracking

One quarter of students in secondary education repeat a grade or are down-tracked

Given the inconsistent selection criteria applied before secondary school, all actors in the system may reasonably anticipate the possibility of a further “filtering out” of students in each track. Unfortunately, this diminishes the incentives for teachers to target support at struggling students to keep them on track. Grade repetition is often viewed as a necessary cost of obtaining good end results8 and/or as a good alternative to down-tracking. Students are “offered” an additional year in the same grade and the time to mature in the same track. It is widely perceived that being promoted to the next grade before being sufficiently proficient may increase the risk of failure and lead to frustration as lower achieving students are not able to cope with more demanding learning tasks (Ikeda and García, 2014). However, any classroom will have students who struggle with the material, and asking students to repeat a year, or down-tracking them, may not be the best approach; with additional support during or after lessons, or at summer schools, those same students may well succeed.

Recommendations 2-4: Reform initial selection process and subsequent permeability of tracks

Recommendation 2: Consider options for reducing the extent of early tracking, as one component of a reform package

Potentially reduce the extent of early tracking

The Dutch system of early tracking faces growing problems. Initial selection into tracks is far too variable and some recent trends and policy developments have exacerbated the challenge of managing early tracking. For example, it has become increasingly difficult to achieve track promotion, meaning that the scope to correct misallocations is falling. There are large overlaps in the cognitive skills of students in different tracks. It could be argued that these issues illustrate the intrinsic flaws of early tracking and that the system requires reforms to reduce tracking in favour of a more comprehensive education. While this argument has its logic, radical wholesale change may be difficult as the Dutch education system achieves good results overall. One reason for this may be that early tracking reflects to some extent the preferences of some students for applied topics, as well as academic selection. In the Netherlands,9 previous attempts to radically change the education system have often proven costly and counterproductive (Van der Werfhorst, Elffers and Karsten, 2015). However, more modest options for reducing the number of tracks, or postponing the age of first tracking, should remain as potential components of reform.

Recommendation 3: Establish a student’s right to enter a track based on a national objective test, and require schools to respect national test standards when selecting students into tracks and subsequently sustaining them in those tracks

There is a tension between consistent tracking criteria and local decision-making

The integrity of the early tracking system is under pressure. There is a tension (some may call it a contradiction) between the central principle of tracking, that students of given performance levels are best suited to a particular educational track, and local school decision-making, which leaves the track allocation decision to the highly variable discretion of local actors. This review argues that if the integrity of tracking is to be sustained, the discretion of local actors has to be substantially restrained.

Base the track decision primarily on a national standardised test

An objective track decision requires a single national end of primary test, which could be extended to examine a broader range of competences than at present. Nationally set objective standards on the required scores for each track level should be established and should determine entry to different tracks. Local discretion by primary teachers and the receiving secondary schools creates both inconsistency and bias and should be removed from the decision. The transparency of such a system would be fair to all students.
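To make the idea concrete, the rule could be expressed as a fixed mapping from a national test score to the highest track a student is entitled to enter. The sketch below is purely illustrative: the cut scores are placeholder values invented for the example and are not actual CITO norms or proposed standards, and the function name `entitled_track` is hypothetical.

```python
from bisect import bisect_right

# Tracks ordered from lowest to highest.
TRACKS = ["VMBO-b", "VMBO-k", "VMBO-g/t", "HAVO", "VWO"]

# PLACEHOLDER cut scores (assumption, not real norms): the minimum
# national test score giving the right to enter each track above the lowest.
CUT_SCORES = [520, 528, 537, 545]


def entitled_track(test_score):
    """Return the highest track whose nationally set cut score is met."""
    # bisect_right counts how many cut scores are <= test_score, which is
    # exactly the index of the highest track the student qualifies for.
    return TRACKS[bisect_right(CUT_SCORES, test_score)]


for score in (519, 537, 545):
    print(score, entitled_track(score))
```

Because the same thresholds apply to every student, two students with the same score would receive the same entitlement regardless of their school or region.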

Implementation of the system would require local co-ordination

Applying common standards in track selection requires the compliance of schools in accepting all students who meet the nationally agreed standards. Schools should have limited freedom in introducing their own selection criteria after the initial selection. Local school co-ordination would be required to manage demand and supply so that holders of a test “ticket” are granted the right to enter a particular track in a local school or schools.

Schools would also need strict limits on their capacity to make students repeat grades or be down-tracked

Limits would be necessary as otherwise schools could accept students but then swiftly push them into a lower track or a lower year, thus subverting, through local discretion, the objectives of the national system. Limits on down-tracking and grade repetition are desirable in their own right. Grade repetition is both costly and relatively ineffective when compared with alternative measures of targeted support for students who struggle at school, and, in the Dutch context, the best defence of repetition is that it is often preferable to down-tracking.

A virtuous circle would link changes in schooling policy to strengthened differentiated teaching

These structural reforms would strongly encourage, and be supported by, changes in teaching practice designed to give more weight to differentiated teaching skills. A reduction of grade repetition and down-tracking calls for more attention to alternative interventions designed to support struggling students within a particular track to succeed. The importance of such differentiated teaching skills is underlined in Chapters 4 and 5. The aim would be to create a virtuous circle in which schools faced with the requirement to sustain students in the same grade and track actively seek and develop innovative solutions to achieve this objective. Some central support measures from MoECS would be necessary to support this development.

Recommendation 4: Promote permeability between all tracks by (a) facilitating upward transition between tracks throughout the school career and (b) merging some tracks

Curricula and learning goals of different tracks should be aligned to facilitate track promotion

Even if initial track selection is conducted as well as possible, some “late bloomers” will need to be promoted to a higher track. Currently, different educational tracks are associated with different learning opportunities, with the gaps being particularly large between vocational and general education. This means that by the time a “late bloomer” is identified they will have to overcome a curricular gulf. Instead, curricula and learning goals need to design in, rather than design out, the possibility of track promotion.

Promoting larger secondary schools through financial incentives

To facilitate track promotion, there is a need to reverse the downward trend in the number of larger schools, ensuring that VMBO, HAVO and VWO remain within one school. The projected demographic decline of the secondary education student population provides further reason – and an opportunity – for promoting larger schools through financial incentives built into secondary education school financing.

Some tracks could usefully be merged

Permeability will be easier if there are fewer tracks and therefore fewer boundaries to manage. The overlaps between the cognitive skills of students in different tracks would be substantially reduced if there were fewer tracks. There is already an active policy debate in the Netherlands about different options for merging tracks, and the OECD in its review of vocational education and training in the Netherlands recommended that VMBO-b and VMBO-k should be merged (Fazekas and Litjens, 2014). Some mergers of different tracks, alongside the other measures discussed above, would help ensure that all students are in the right track.


1. This only applies if peer effects are non-linear (Hoxby, 2000).

2. The Finnish comprehensive school reform abolished the old two-track school system and created a uniform 9-year comprehensive school, effectively delaying the age of selection and reducing stratification (Pekkarinen, Uusitalo and Kerr, 2009).

3. Defined here as level 1 or below.

4. VMBO-g and VMBO-t are fused for practical purposes (very small N in VMBO-g in the PISA sample).

5. This applies for all three PISA domains: literacy, numeracy and problem solving. The additional calculations are available upon request.

6. The CITO test is an end of primary attainment test. Schools are required to report on the extent to which their students have reached expected core learning objectives. While schools are free to use different instruments for this purpose, the vast majority of schools use CITO’s end-of-primary test, which provides information on the school type most suitable for each student in the next phase of education. Since the 2014/15 school year it has been mandatory for primary schools to administer regular student monitoring systems, as well as a final test at the end of Year 8. Schools can choose to administer the CITO tests or alternative tests, provided they meet central quality requirements (Nusche et al., 2014).

7. At the end of primary education, the vast majority of schools administer an aptitude test called the CITO Eindtoets Basisonderwijs (“CITO final test primary education”, abbreviated to “CITO test”), developed by the Central Institute for Test Development, which is designed to recommend the type of secondary education best suited for a pupil given his or her cognitive abilities.

8. This is evident from strong societal beliefs in the benefits of grade repetition in the Netherlands (Goos et al., 2013).

9. For example, the radical change in the system in 1999 through the so-called “Basisvorming” reform.