The Usefulness of PISA Data for Policy Makers, Researchers and Experts on Methodology
Are students well prepared to meet the challenges of the future? Are they able to analyse, reason and communicate their ideas effectively? Have they found the kinds of interests they can pursue throughout their lives as productive members of the economy and society? The OECD Programme for International Student Assessment (PISA) seeks to provide some answers to these questions through its surveys of key competencies of 15-year-old students. PISA surveys are administered every three years in OECD member countries and a group of partner countries, which together make up close to 90% of the world economy.
Exploratory Analysis Procedures
PISA surveys use complex methodologies that condition the way data should be analysed. As these methodologies are not yet built into the standard procedures of statistical software packages such as SAS® or SPSS®, this manual describes them in detail and presents syntax and macros developed specially for analysing the PISA data.
National and international surveys usually collect data from a sample. Dealing with a sample rather than the whole population is preferable for several reasons.
First, for a census, all members of the population need to be identified. This identification process presents no major difficulty for human populations in some countries, where national databases with the name and address of all, or nearly all, citizens may be available. However, in other countries, it is not possible for the researcher to identify all members or sampling units of the target population, mainly because it would be too time-consuming or because of the nature of the target population.
In most cases, as mentioned in Chapter 3, national and international surveys collect data from a sample instead of conducting a full census. However, for a particular population, there are thousands, if not millions, of possible samples, and these samples do not necessarily yield the same estimates of population statistics. Every generalisation made from a sample, i.e. every estimate of a population statistic, has an associated uncertainty or risk of error. The sampling variance is the measure of this uncertainty due to sampling.
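The idea that different samples yield different estimates can be illustrated with a small simulation. The following Python sketch is an illustration only, not one of the manual's macros: the population is synthetic, the sampling is simple random sampling (PISA itself uses a more complex design), and the names `sample_means`, `population` and `sampling_variance` are hypothetical.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Synthetic population of scores (assumed values, not PISA data).
rng = random.Random(42)
population = [rng.gauss(500, 100) for _ in range(10_000)]

# Each sample gives a slightly different estimate of the population mean.
means = sample_means(population, sample_size=100, n_samples=1_000)

# The variance of these estimates across samples is the sampling variance.
sampling_variance = statistics.pvariance(means)
print(round(min(means), 1), round(max(means), 1), round(sampling_variance, 1))
```

With samples of 100 students drawn from a population whose standard deviation is about 100, the sampling variance of the mean is expected to be close to 100, i.e. a standard error of about 10 points.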
The Rasch Model
International surveys in education such as PISA are designed to estimate the performance in specific subject domains of various subgroups of students, at specific ages or grade levels.
For the surveys to be considered valid, many items need to be developed and included in the final tests. The OECD publications related to the assessment frameworks indicate the breadth and depth of the PISA domains, showing that many items are needed to assess a domain as broadly defined as, for example, mathematical literacy.
Education assessments can have two major purposes:
1. To measure the knowledge and skills of particular students. The performance of each student usually will have an impact on his or her future (school career, admission to post-secondary education, and so on). It is therefore particularly important to minimise the measurement error associated with each individual’s estimate.
2. To assess the knowledge or skills of a population. The performance of individuals will have no impact on their school career or professional life. In such a case, the goal of reducing error in making inferences about the target population is more important than the goal of reducing errors at the individual level.
Computation of Standard Errors
Analyses with Plausible Values
As described in Chapters 5 and 6, the cognitive data in PISA are scaled with the Rasch Model and the performance of students is denoted with plausible values (PVs). For minor domains, only one scale is included in the international databases. For major domains, a combined scale and several subscales are provided. For each scale and subscale, five plausible values per student are included in the international databases. This chapter describes how to perform analyses with plausible values.
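A minimal Python sketch of how results from several plausible values are commonly combined is given below. It assumes the usual multiple-imputation rule: the statistic is computed once per plausible value, the final estimate is the average of these results, and the final error variance adds the average sampling variance to the between-PV (imputation) variance inflated by (1 + 1/M). The function name and the input numbers are hypothetical, and the sampling variances are taken as given rather than computed.

```python
import statistics

def combine_plausible_values(estimates, sampling_variances):
    """Combine one estimate per plausible value into a final estimate and SE."""
    m = len(estimates)
    final_estimate = statistics.fmean(estimates)
    # Average of the sampling variances obtained with each plausible value.
    mean_sampling_variance = statistics.fmean(sampling_variances)
    # Variance between the M estimates (divides by M - 1).
    imputation_variance = statistics.variance(estimates)
    total_variance = mean_sampling_variance + (1 + 1 / m) * imputation_variance
    return final_estimate, total_variance ** 0.5

# Made-up mean estimates for five plausible values and their sampling variances.
pv_estimates = [501.0, 502.0, 500.0, 503.0, 499.0]
pv_sampling_variances = [4.0, 4.0, 4.0, 4.0, 4.0]

estimate, standard_error = combine_plausible_values(pv_estimates, pv_sampling_variances)
print(estimate, round(standard_error, 3))
```

Averaging the five plausible values per student first and then computing the statistic once would not be equivalent, because it ignores the imputation variance.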
Use of Proficiency Levels
The values for student performance in reading, mathematics, and science literacy are usually considered as continuous latent variables. In order to facilitate the interpretation of the scores assigned to students, the reading, mathematics and science scales were designed to have an average score of 500 points and a standard deviation of 100 across OECD countries. This means that about two-thirds of the OECD member country students perform between 400 and 600 points.
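The "about two-thirds" figure can be checked with a short calculation. The sketch below assumes that scores are approximately normally distributed with mean 500 and standard deviation 100; normality is an assumption made here for illustration, and `normal_cdf` is a hypothetical helper.

```python
import math

def normal_cdf(x, mean=500.0, sd=100.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Share of students within one standard deviation of the mean.
share_400_to_600 = normal_cdf(600) - normal_cdf(400)
print(round(share_400_to_600, 4))
```

Under this assumption the share is roughly 68%, which is consistent with about two-thirds.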
Analyses with School-Level Variables
The target population in PISA is 15-year-old students. This population was chosen because, at this age in most OECD countries, students are approaching the end of their compulsory schooling. Thus, PISA should be able to indicate the cumulative effect of a student’s education.
Standard Error on a Difference
This chapter will discuss the computation of standard errors on differences. Following a description of the statistical issues for such estimates, the different steps for computing such standard errors will be presented. Finally, the correction of the critical value for multiple comparisons will be discussed.
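As a rough illustration of the two ideas named above, the Python sketch below computes the standard error on a difference between two estimates from independent samples, and a Bonferroni-adjusted critical value for multiple comparisons. Both are assumptions made for this example: the independence formula does not hold for groups drawn from the same sample (for instance, two subgroups within one country), and Bonferroni is one common correction, not necessarily the only one the chapter covers. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def se_of_difference(se_a, se_b):
    """Standard error on a difference between two independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

def bonferroni_critical_value(alpha, n_comparisons):
    """Two-sided critical z value after dividing alpha by the number of tests."""
    adjusted_alpha = alpha / n_comparisons
    return NormalDist().inv_cdf(1 - adjusted_alpha / 2)

# Made-up standard errors for two country means.
se_diff = se_of_difference(3.0, 4.0)

# Critical values for a single comparison and for ten comparisons.
z_single = bonferroni_critical_value(0.05, 1)
z_ten = bonferroni_critical_value(0.05, 10)
print(se_diff, round(z_single, 3), round(z_ten, 3))
```

A difference is then judged significant when its absolute value divided by `se_diff` exceeds the critical value, which grows as more comparisons are made.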
OECD Total and OECD Average
The PISA initial and thematic reports present results for each country and two additional aggregated estimates: the OECD total and the OECD average.
The OECD total considers all the OECD countries as a single entity, to which each country contributes proportionally to the number of 15-year-olds enrolled in its schools. To compute an OECD total estimate, data have to be weighted by the student final weight, i.e. W_FSTUWT.
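The contrast between the two aggregates can be sketched in Python as follows. The OECD total follows the description above (all students pooled and weighted by W_FSTUWT). The OECD average is computed here as the unweighted mean of the country estimates, which is the usual definition but is an assumption as far as this text is concerned. The records and country labels are made up.

```python
def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical records: (country, score, W_FSTUWT).
students = [
    ("A", 520.0, 10.0),
    ("A", 480.0, 10.0),
    ("B", 400.0, 100.0),
    ("B", 420.0, 100.0),
]

# OECD total: every student weighted by the final student weight.
oecd_total = weighted_mean(
    [score for _, score, _ in students],
    [weight for _, _, weight in students],
)

# OECD average (assumed definition): each country contributes equally.
countries = sorted({country for country, _, _ in students})
country_means = [
    weighted_mean(
        [s for c, s, _ in students if c == country],
        [w for c, _, w in students if c == country],
    )
    for country in countries
]
oecd_average = sum(country_means) / len(country_means)
print(round(oecd_total, 2), oecd_average)
```

In this toy example the larger country B dominates the OECD total, while both countries count equally in the OECD average.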
Policy makers and researchers require information on how indicators change over time. An analysis of the impact of reforms on the education system is one example: policy makers seek to measure changes in the targeted area to gauge the effectiveness of their policies. In the early 1960s, for example, most OECD countries implemented education reforms to facilitate access to tertiary education, mainly through financial help. One indicator of the impact of these reforms would be the percentage of the population with a tertiary qualification, calculated for several years to show how it has evolved. Computing this trend indicator is a straightforward statistical manipulation, since the measure (i.e. whether or not an individual has completed tertiary education) is objective and, in most cases, available at the population level. Nevertheless, such measures can be slightly biased by, for example, differing levels of immigration over a period of time, student exchange programmes, and so on.
Studying the Relationship between Student Performance and Indices Derived from Contextual Questionnaires
The PISA initial reports have used the following tools to describe the relationship between student performance and questionnaire indices: (i) dividing the questionnaire indices into quarters and then reporting the mean performance by quarter; (ii) the relative risk; (iii) the effect size; and (iv) the linear regression. This chapter discusses technical issues related to these four tools and presents some SAS® macros that facilitate their computation.
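Two of these tools can be sketched briefly in Python. The definitions used are common textbook forms and are assumptions here: relative risk as the ratio of two outcome rates, and effect size as the mean difference divided by the square root of the average of the two group variances. The chapter's SAS® macros additionally handle weights and standard errors, which this sketch ignores. All names and numbers are hypothetical.

```python
import math

def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the outcome rate in the exposed group to that in the other group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def effect_size(mean_a, mean_b, variance_a, variance_b):
    """Standardised mean difference using the average of the two variances."""
    return (mean_a - mean_b) / math.sqrt((variance_a + variance_b) / 2)

# Made-up counts: low performers among students in the bottom quarter of an
# index (30 of 100) versus students in the other quarters (15 of 100).
rr = relative_risk(30, 100, 15, 100)

# Made-up group means and variances on the performance scale.
d = effect_size(520.0, 480.0, 100.0 ** 2, 100.0 ** 2)
print(round(rr, 2), round(d, 2))
```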
Over the last 20 years, education survey data have been increasingly analysed with multilevel models. Indeed, since simple linear regression models do not take into account the potential effects that may arise from the way in which students are assigned to schools or to classes within schools, they may provide an incomplete or misleading representation of efficiency in education systems. In some countries, for instance, the socio-economic background of a student may partly determine the type of school that he or she attends and there may be little variation in the socio-economic background of students within each school. In other countries or systems, schools may draw on students from a wide range of socio-economic backgrounds, but within the school, the socio-economic background of the student affects the type of class he or she is allocated to and, as a result, the within-school variance is affected. A linear regression model that does not take into account the hierarchical structure of the data will thus not differentiate between these two systems.
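The point about hierarchical structure can be illustrated with a simple variance decomposition in Python. This is a descriptive sketch, not a fitted multilevel model: it assumes equal-sized schools and no weights, and the two toy systems and the function name are hypothetical. Both systems contain exactly the same four scores, so any analysis that pools students sees identical distributions, yet the share of variance lying between schools differs completely.

```python
from statistics import fmean, pvariance

def between_school_share(schools):
    """Share of total score variance that lies between schools.

    Assumes equal-sized schools, so the total variance splits exactly into
    the variance of school means plus the average within-school variance.
    """
    all_scores = [score for school in schools for score in school]
    school_means = [fmean(school) for school in schools]
    return pvariance(school_means) / pvariance(all_scores)

# System 1: students are sorted into schools by performance.
segregated_system = [[400.0, 410.0], [590.0, 600.0]]

# System 2: the same students, but every school is mixed.
mixed_system = [[400.0, 600.0], [410.0, 590.0]]

share_segregated = between_school_share(segregated_system)
share_mixed = between_school_share(mixed_system)
print(round(share_segregated, 3), round(share_mixed, 3))
```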
PISA and Policy Relevance – Three Examples of Analyses
This chapter will provide three examples of possible analyses with PISA data. The examples will begin with a concrete policy question which will be followed by a step-by-step analysis:
1. how to translate a policy question into a working hypothesis;
2. how to choose the most appropriate approach to answer the hypothesis;
3. how to compute, referring to the relevant chapters in this manual on technical matters;
4. how to interpret the results;
5. how to draw policy recommendations.
This chapter presents the 17 SAS® macros used in the previous chapters. These are also available from www.pisa.oecd.org. Table 17.1 presents a summary of the 17 SAS® macros. The file names are in blue and the macro names as well as their arguments are in black.