4. Testing cognitive functions in children: A clinical perspective

Sylvie Chokron
Institut de Neuropsychologie, Neurovision & Neurocognition, Hôpital-Fondation Adolphe de Rothschild; Integrative Neuroscience and Cognition Center, CNRS UMR 8002 & Université de Paris

Neuropsychology represents the linkage between behaviour (including cognitive function) and the brain substrate. Neuropsychological assessment provides standardised, objective and reliable measures of diverse aspects of human behaviour. This allows for the specification of each individual’s unique profile (Ivnik, Smith and Gerhan, 2001[1]).

Brain-behaviour relationships in a developing child are both qualitatively and quantitatively different from those in adults. Neuropsychological evaluation in children can thus be seen as measuring the result of the interaction between development (in particular, brain maturation), environmental stimulation, education, the effect of a brain lesion (in brain-damaged children), brain plasticity and any proposed rehabilitation.

Referral to a child neuropsychologist is appropriate for diverse clinical, research and/or academic reasons. Whatever the context, test results are considered along with careful history-taking, precise clinical observations and respect for the socio-cultural context to best describe the child’s cognitive function. This is particularly important given that neuropsychological evaluation may have some predictive value for a child. It will, for example, often serve as a basis for academic or vocational decisions and therapeutic programmes.

In clinical practice, the neuropsychological evaluation has several aims. First, it describes the child’s weaknesses and strengths in order to set an appropriate rehabilitation and education programme. Second, it aims to educate the school and family about the child’s needs.

Testing models differ in strategy, and the kind of test to use with children referred for an evaluation is still debated. Should it be a fixed battery, a flexible battery, a process approach or a personal core battery?

In clinical practice, some flexibility between the models is often required. The choice of a test and the way it is administered will influence the evaluation. A fixed battery, for example, is useful for obtaining normative and research data. However, an inflexible fixed-battery approach often proves insensitive and inappropriate for children with instrumental deficits (verbal, motor, attentional or perceptual). In addition, a fixed battery may fail to address the child’s clinical complaints or the questions raised by the family or school about the child’s difficulties.

Flexible batteries are thus built by adding the clinician’s own personal core battery, or selected tests, to a fixed battery according to individual needs. Tailoring the evaluation in this way makes it possible to assess domains or subdomains omitted from a rigidly structured battery. The neuropsychological evaluation can then include more qualitative data to nuance the objective, quantitative data obtained through fixed batteries.

Of course, the child must be understood in a larger context. The tests rely on standardised, objective and reliable measures of behaviour. However, neuropsychologists have to compare these measures to the psychological, educational and socio-cultural context of the examined child. This helps to determine to what extent the test results are attributable to brain-based mechanisms or to other, environmental factors.

The neuropsychologist usually performs a profile analysis at the end of the evaluation to define a child’s neurocognitive strengths and weaknesses. The analysis of quantitative features with respect to appropriate normative data is then integrated with qualitative observations about the child’s individual style, temperament and environment.

What is observed during the evaluation may differ considerably from the child’s actual cognitive level. The child’s performance may be influenced by various factors. These range from personality traits such as intellectual inhibition, anxiety, fear and lack of self-confidence to biological factors such as fatigue and hunger. This intra-individual variability is inherent to human testing and is not expected when testing artificial intelligence (AI) devices or robots. From this point of view, psychometric tests are better suited to robots than to humans.

Regarding the framework used for examining data, there are different approaches to interpreting neuropsychological test data: absolute scores, difference scores and profile variability, which use cross-sectional data, and change scores, which use longitudinal data. Absolute scores refer to a single score from each test that might best differentiate each diagnostic group. Difference scores compare performance on tests sensitive to neurocognitive dysfunction with that on tests resistant to these effects. Profile variability assumes that impairment will affect performance differently across a range of tests. Change scores refer to longitudinal data obtained at test-retest intervals (Ivnik, Smith and Gerhan, 2001[1]).
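The four approaches can be made concrete with a small sketch. The Python below is illustrative only: the function names, the use of z-scores against a normative mean and standard deviation, and the sample numbers are assumptions introduced here, not procedures taken from Ivnik, Smith and Gerhan (2001[1]) or from any test manual.

```python
from statistics import pstdev


def z_score(raw, norm_mean, norm_sd):
    """Absolute score: one test result expressed relative to normative data."""
    return (raw - norm_mean) / norm_sd


def difference_score(sensitive_z, resistant_z):
    """Difference score: a test sensitive to dysfunction minus a resistant one."""
    return sensitive_z - resistant_z


def profile_variability(z_scores):
    """Profile variability: spread of standardised scores across several tests."""
    return pstdev(z_scores)


def change_score(baseline_z, retest_z):
    """Change score: retest minus baseline on the same test (longitudinal)."""
    return retest_z - baseline_z


# Hypothetical subtest results; a normative mean of 10 and SD of 3 are assumed.
profile = {"coding": 4, "block_design": 7, "vocabulary": 13}
z = {name: z_score(raw, 10, 3) for name, raw in profile.items()}

gap = difference_score(z["coding"], z["vocabulary"])  # sensitive vs resistant
spread = profile_variability(list(z.values()))        # scatter across the profile
gain = change_score(z["coding"], -1.0)                # hypothetical retest at z = -1
```

In this toy profile, the large negative gap and the wide spread would both prompt a closer look at the low subtest, whereas the change score only becomes available once the child has been retested.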

Before describing the most commonly used neuropsychological tests around the world, the major advantages and limitations of intelligence tests are presented.

The ubiquitous IQ test provides the experienced clinician with a well-researched statistical basis for interpretation of individual subtest and factor index scores. Often, a quick appraisal of subtest scaled scores will offer the clinician pertinent cues about potential strengths and weaknesses, even when overall performance is successful. Although IQ tests are not designed to diagnose neuropsychological deficits, low performance on coding and picture completion, for example, will alert the clinician to a possible attentional deficit. Similarly, lowered scores on the block design subtest with intact verbal scores may point to a praxis or visuo-spatial deficit. In addition, intelligence tests often consist of multiple parts that are co-normed, allowing the child’s performance on one subtest to be compared directly with performance on another.

When a child’s IQ is above what is expected for his/her age, this well-standardised measure will exclude any neuropsychological deficiency. However, when the IQ is below expectations, the question of the origin of this lowered score arises. A lowered IQ can be associated with various conditions ranging from intellectual deficiency to lack of motivation, cultural or even physical reasons (e.g. sleep debt, a virus).

Comparing the performance of a child and a robot on IQ tests raises the question of the underlying cause of the performance. In any case, many factors can potentially influence the performance of a human subject on IQ tests. Thus, IQ scores cannot be considered a reflection of a subject's intellectual efficiency alone.

Despite their prevalence in psychological testing, several limitations are associated with the administration of intelligence tests in child neuropsychology practice.

First, IQ tests have not been standardised among pathological populations. It is thus difficult to administer such tests to children with instrumental (verbal, motor, perceptual) or attentional deficits.

Along those lines, results on IQ tests might be insufficiently sensitive or specific for neuropsychological evaluation. In particular, psychometric tests say nothing about the nature and the aetiology of the deficit causing the lowered IQ. This is probably because intelligence tests are not instruments validated with respect to brain function. Rather, they simply raise a suspicion that must be tested with other measures. For example, a significant split between verbal and performance IQ can be difficult (almost impossible) to interpret on the basis of the intelligence test alone. Indeed, the performance scale might be lowered by motor, spatial, visual or even non-cognitive factors, making discussion of IQ superfluous. This is a major concern in children whose motor or visuo-spatial deficit is still undiagnosed when the intelligence test is administered.

Thus, whereas IQ tests represent a broad sweep of cognitive performance, neuropsychological evaluation attempts to provide a finer delineation of meaningful elements about how a child perceives, integrates and expresses information. In this way, the neuropsychological evaluation aims at examining the child’s discrete behaviour across a wide variety of domains, especially with tests validated with respect to brain function.

Although intelligence tests are often used in children with learning disabilities, they are not “neuropsychological instruments” nor were they constructed for such use. In addition, an IQ score as a summary score does not reveal the basic processes that contributed to, or negatively impacted, the child’s functioning. Finally, the length of standardised intelligence tests is a further disadvantage in the neuropsychological evaluation of children who may suffer from attentional, perceptual or motor disorders.

Despite all these limitations, an intelligence test is routinely administered to a child referred for a neuropsychological evaluation when such testing has not been recently obtained. In many industrialised countries, the referral source, the socio-medical institutions and the school system often expect its administration.

The cognitive tests most often used in neuropsychology are presented below. In addition, more specific tests evaluating precise cognitive functions are highlighted; they represent an alternative to IQ tests in children with instrumental or learning disabilities.

This section focuses on the main cognitive tests used in neuropsychological practice. It begins with complete evaluations testing the whole cognitive function before examining other tests that evaluate more specific areas (see Table 4.1). In these tests, neuropsychology considers the following taxonomy of abilities: perception, attention, memory, executive functions, language, verbal and visual reasoning, and calculation. It then looks at how these different abilities are measured.

The following five subtests are administered to younger children (ages 2 years 6 months to 3 years 11 months):

  • Receptive vocabulary

This subtest measures a child’s ability to identify correct responses to spoken words (e.g. identifying a picture of a “fish” that represents the word “fish” spoken by the examiner). However, most of the receptive or expressive vocabulary subtests use pictures that must be pointed to or named by the children.

If the child is unable to visually explore, perceive, analyse or recognise the visual items, it will be impossible for him/her to perform the task. If this happens, it most often leads to a suspicion of a verbal deficit because the task is considered as involving language. However, it is first a visual task that can be failed because of a visual or spatial deficit.

  • Information

This subtest measures general cultural knowledge, long-term memory and acquired facts. This subtest is largely linked to culture, education and school knowledge.

  • Block design

This subtest measures an individual’s ability to analyse and synthesise an abstract design and reproduce that design from coloured plastic blocks. It involves visuo-spatial analysis, simultaneous processing, visual-motor co-ordination, dexterity and non-verbal concept formation.

  • Object assembly

Like block design, this subtest involves visuo-spatial analysis and visual-motor co-ordination. Here, however, the child assembles puzzle pieces to form a meaningful object rather than reproducing an abstract design from coloured blocks.

  • Picture naming

This subtest assesses an individual's ability to name pictorial stimuli. This subtest involves verbal capacities but also visual capacities. It can thus be seen as a language test but also as a visual recognition test.

The following 14 subtests are administered to older children (ages 4 years 0 months to 7 years 7 months):

  • Block design

This subtest has the same principles as the one for younger children.

  • Similarities

This subtest is supposed to measure verbal reasoning. However, like other subtests, it also involves visual, spatial and attentional cognition. Two similar but different objects or concepts are presented. The student is asked to tell how they are alike or different (e.g. what do the ear and eye have in common?).

  • Picture concepts

Students are asked to look at two (or three) rows of pictured objects and indicate (by pointing) the single picture from each row that shares a characteristic in common with the single picture(s) from the other row(s). This subtest aims to evaluate categorical and abstract reasoning. However, it first measures visual, spatial and attentional processing.

  • Coding

In this subtest, children must learn associations between non-verbal shapes and complete (draw) the corresponding shape in an empty box as quickly as possible. Non-verbal associative and short-term (working) memory are involved. This subtest also involves other abilities such as visual analysis, fine motor dexterity, speed, accuracy and the ability to manipulate a pencil. Because so many processes contribute to success at this task, it can be failed for multiple reasons. In the simplest form of the test, children use their ink daubers and stamp the shape that goes with each animal according to the “key” given to them. For example, they would stamp a “circle” when they see a fish, a “star” when they see a cat and a “square” when they see a turtle.

  • Vocabulary

This subtest measures several verbal abilities: verbal fluency, concept formation, word knowledge and word usage. It is thus influenced by prior knowledge and, in this way, by culture and education.

  • Matrix reasoning

In this subtest, the subject must perceive, analyse and understand the logical rule between shapes. As mentioned above, although this task is supposed to evaluate non-verbal reasoning, it also measures visual, spatial and attentional cognition.

  • Comprehension

In this subtest, the child must answer questions based on his or her understanding of general principles and social situations. This subtest measures the student’s verbal comprehension but also common-sense social knowledge, practical judgement in social situations and social maturity. In addition, it evaluates the student’s moral conscience.

  • Symbol search

This subtest requires the student to determine whether a target symbol (geometric form) appears among the symbols shown in a search group. This subtest requires perceptual, spatial, attentional and speed abilities. For this reason, children can fail this subtest because of visuo-spatial deficits or slowness.

  • Picture completion

This subtest measures a student's ability to recognise familiar items and to identify missing parts (e.g. naming the missing sand in an egg timer). In this way, this subtest requires analysis of fine visual details, as well as visual, attentional and visual memory abilities.

  • Information

This subtest measures general cultural knowledge and long-term didactic memory. For this reason, this subtest depends on the ability of the student to recall facts and information previously taught in school.

  • Word reasoning

This subtest measures verbal abstract reasoning requiring analogical and categorical thinking, as well as verbal concept formation and expression.

  • Receptive vocabulary

This subtest requires the child to look at a group of four pictures and point to the one the examiner names aloud. As in most of the verbal subtests, the child needs first to visually analyse and recognise the stimulus. However, this subtest is considered to rely mostly on prior verbal knowledge.

  • Object assembly

This subtest embodies the same principles as the subtest for younger children.

  • Picture naming

This subtest embodies the same principles as the one for younger children.

Several non-verbal tests, as well as verbal subtests, are influenced by visual perception, spatial cognition and sustained attention. Indeed, among the 14 different subtests, 10 require good abilities in visual and spatial processing. In this way, failure on these subtests can easily result from visuo-spatial or attentional deficits rather than a pure intellectual deficit. In addition, many subtests are time-limited and can thus also be affected by slowness. On the other hand, several verbal tests involve prior word knowledge and in this way are sensitive to culture and education.
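The count above can be checked against the subtest descriptions. The sketch below is a hypothetical illustration: the True/False coding of each subtest is an interpretive reading of the descriptions in this section, not a published scoring key, and `visual_share` is a helper name introduced here.

```python
# Does each subtest for older children draw on visual or spatial processing?
# Coded from the descriptions in this section (an assumption, not a test-manual key).
VISUAL_DEMAND = {
    "block design": True,
    "similarities": True,
    "picture concepts": True,
    "coding": True,
    "vocabulary": False,
    "matrix reasoning": True,
    "comprehension": False,
    "symbol search": True,
    "picture completion": True,
    "information": False,
    "word reasoning": False,
    "receptive vocabulary": True,
    "object assembly": True,
    "picture naming": True,
}


def visual_share(failed_subtests):
    """Return the fraction of failed subtests that depend on visual or spatial processing."""
    failed = [name.lower() for name in failed_subtests]
    unknown = [name for name in failed if name not in VISUAL_DEMAND]
    if unknown:
        raise KeyError(f"unknown subtests: {unknown}")
    if not failed:
        return 0.0
    return sum(VISUAL_DEMAND[name] for name in failed) / len(failed)


n_visual = sum(VISUAL_DEMAND.values())  # subtests coded as visually demanding
share = visual_share(["Coding", "Symbol search", "Vocabulary"])
```

A high share among a child's failed subtests does not establish a visuo-spatial deficit. It only signals that such a deficit should be ruled out before the failures are read as intellectual.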

The K-ABC (Kaufman and Kaufman, 1983[2]) was normed on 2 000 children, aged 2 years 6 months to 12 years 5 months. This test is based on Luria’s theory of simultaneous and successive information processing. Its approach defines intelligence in terms of the child’s problem-solving and processing styles. Seven subtests are dedicated to simultaneous processing whereas sequential processing is tested through three subtests. There are also six achievement subtests.

  • K-ABC sequential process scale

Hand movements: the child must reproduce a sequence of gestures shown by the examiner.

Immediate memory of numbers: the child must repeat a series of numbers in the order in which they were presented.

Sequence of words: the child points to object silhouettes in the order named by the examiner. For added difficulty, children over five years old must also name colours before pointing at the objects.

  • K-ABC simultaneous process scale

Pattern recognition: the child mentally completes an unfinished drawing and names or describes it to the examiner.

Triangles: triangles, blue on one side and yellow on the other, are presented to the child who assembles them to reproduce a model.

Analogue matrices: the child completes incomplete matrices with pictures.

Spatial memory: the child memorises the space occupied by drawings randomly arranged on a page.

Series of pictures: the child arranges in chronological order pictures that form a short story.

The items are not time-limited. To help assess children from cultural or linguistic minorities, as well as children with auditory or verbal deficits, the battery includes a non-verbal scale of six subtests in which the examiner gives instructions through gestures (pantomime).

The correlation between results on the Wechsler scale (Raiford, 2018[3]), the main intelligence scale for children, and results on the K-ABC is quite good. However, in cases of learning disabilities, the K-ABC generally performs better than the WISC-R. The K-ABC seems better suited than other tests to analysing the reasons for academic failure. This is especially the case when the failures are unexpected or when the children come from a socio-cultural minority background. The K-ABC may allow a better perception of where the child's difficulties lie, especially in cases of autism spectrum disorders and dyslexia.

The NEPSY comprises tests created for children from 3 to 12 years old (Brooks, Sherman and Strauss, 2009[4]). It allows neuropsychologists to understand the cognitive, behavioural and academic problems of young children. It also allows them to detect and define the deficit in brain-damaged children.

Like the K-ABC, the NEPSY follows Luria's approach and makes it possible to identify the primary deficits that may underlie various learning disorders. The NEPSY was originally developed in Finland for young children. It was subsequently extended and normed on 1 000 English-speaking children, 100 at each age level, from 3 to 12 years.

The NEPSY focuses on five broad functional domains: attention and executive functions; language; sensorimotor functions; visuo-spatial processing; and memory and learning. Each domain is tested with specific subtests at each age (see Table 4.2).

For each domain, there are core tests. In addition, supplementary tests allow a possible deficit to be examined in greater depth when the results of the core tests are low. Supplemental scores allow a particular deficit to be specified. Finally, qualitative observations are made (e.g. on the child's behaviour or strategy).

This section looks at different kinds of tests for attention, memory, visual and spatial cognition, visuo-motor co-ordination and executive functions.

Vast amounts of information reach humans at every moment, both from the outside world (visual, auditory, tactile, etc.) and also from the body (internal temperature, heartbeat, pains or position of the body in space). The central nervous system cannot process all this information simultaneously. Therefore, it selects the most relevant information to be processed at each moment by the brain (Chokron, 2010[5]).

Among the information that comes to us, some is recurrent, usual and “predictable”. Other information will “capture our attention” because it is new, particularly interesting, unexpected, etc. The role of attention is to select and privilege the information to be processed on the basis of its novelty, relevance or the constraints of the moment and motivation. A degree of attentional control is necessary for any successful task performance.

Sustained attention or vigilance corresponds to a state of alertness that allows us to be receptive to the presented information. Under normal waking conditions, humans possess sustained attention abilities that allow us to interact effectively with the outside world. Of course, there are great variations within and between individuals in the way they implement this ability. Even in the normal state (without brain damage), sustained attention can thus vary depending on the time of day, physical condition, psychological state, external conditions or motivation (Chokron, 2010[5]).

In addition, there is selective attention, which, as its name suggests, is based on the notion of information selection. Selective spatial attention refers to the ability to select information in some portions of the external space. Divided attention refers to the ability to respond to more than one task or event simultaneously. Finally, alternating attention or mental shifting refers to the ability to maintain mental flexibility to shift from one task requirement to another when these have different cognitive requirements.

The ability to orient attention develops gradually from birth. Internally driven ability to scan the environment is actively established by five or six years of age. The ability to focus attention on a sensory source or on a task is expected to be established by the age of seven, and sustained attention abilities develop until adolescence (Helland and Asbjornsen, 2000[6]).

In clinical neuropsychology practice, attention is a process or domain that is assessed as one component contributing to the child’s overall neurocognitive competence. Fatigue affects attention, as do several other factors such as motivation, affective state and hunger. Therefore, failure on a cognitive test (whatever its nature) could be due to a decrease in attentional capacities. Attention is not mediated by a single brain region or by the brain as a whole. Instead, it is carried out by discrete anatomical networks, within which specific computations are assigned to different brain areas (Posner and Petersen, 1990[7]).

A number of attention types, or subdomains, are described in the literature. Commonly, one refers to terms such as focused attention, switching or mental set shifting, and divided attention.

The Test of Everyday Attention for Children (TEA-Ch) (Manly et al., 1999[8]) is a children’s version of the adult eight-subtest Test of Everyday Attention (TEA) (Robertson et al., 1995[9]). TEA-Ch measures different components of attention (selective attention, attentional control, sustained attention) through the nine subtests described below.

  • Selective attention

Sky search: this subtest requires that the child filters information to detect relevant information and reject or inhibit distracting information. Specifically, the child must seek pairs of “spaceship” stimuli and rapidly circle all occurrences amid competing non-paired stimuli.

Map mission: in this subtest, the children are given a printed A3 laminated city map. They have to circle as many visual targets as possible (among 80 targets) and ignore distractors.

  • Attentional control

Creature counting: this subtest measures attentional control and switching. It draws on executive functions such as working memory and mental flexibility: the child counts stimuli and follows visual cues that indicate whether to count up or count down.

Opposite worlds: this is a timed measure of attentional control and switching. The child must either read sequenced chains of numbers as they appear (same world condition) or inhibit the prepotent response and respond with the alternate number (i.e. 1 for 2 and 2 for 1; different world condition). This subtest makes the association between the stimulus (a digit) and the response (the word “one” or “two”) explicit. It thus resembles the conflicting-response requirement of the Stroop test.

  • Sustained attention

Score!: this subtest presents ten item-counting measures. In each item, between 9 and 15 identical tones of 345 ms are presented, separated by silent interstimulus intervals of variable duration (between 500 and 5 000 ms). Children are asked to silently count the tones (without assistance of fingers) and to give the total at the end.

Code transmission: this subtest is an auditory vigilance-level measure. The child has to listen to a taped 12-minute recording of single-digit numbers presented at 2-second intervals. On hearing the target “55” (two 5s in succession), the child has to immediately announce the digit presented just before it. The score given is the number of digits correctly announced by the participant. There are 40 target presentations. This subtest is a variation of an n-back task.

  • Sustained, divided attention and response inhibition

Sky search DT: in this subtest, children have to circle paired spaceship stimuli (as in the sky search task), while also keeping a count of auditory tones until all target stimuli are circled.

Score! DT: this subtest measures sustained auditory attention requiring the child to listen to and count tape-recorded tones, while also listening for an animal named by the announcer in a news broadcast.

Walk/Don’t walk: this measures sustained attention and response inhibition. A child learns tones that allow progression (go) or require inhibition (no-go) and then makes a mark accordingly. The speed of tone presentation increases as the task progresses. The child must avoid making a mark in the no-go condition.
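The scoring logic of two of the subtests above can be sketched in a few lines. This is a hypothetical illustration, not the TEA-Ch scoring procedure, which is defined in the test manual. It assumes that the code transmission target is heard as two consecutive 5s in the digit stream, that responses are recorded against the position of the second 5, and that Walk/Don't walk trials are logged as (tone, marked) pairs. All function names and data formats are invented here.

```python
def score_code_transmission(digits, responses):
    """Return (correct, targets) for a stream of single digits.

    digits: the digits in the order heard.
    responses: dict mapping the index of the second 5 of a "5 5" pair
               to the digit the child announced (assumed recording format).
    """
    correct = 0
    targets = 0
    for i in range(2, len(digits)):
        if digits[i] == 5 and digits[i - 1] == 5:
            targets += 1
            # The expected answer is the digit heard just before the pair.
            if responses.get(i) == digits[i - 2]:
                correct += 1
    return correct, targets


def score_walk_dont_walk(trials):
    """Return (hits, false_alarms) for go/no-go trials.

    trials: list of (tone, marked) pairs, tone being "go" or "no-go".
    """
    hits = sum(1 for tone, marked in trials if tone == "go" and marked)
    false_alarms = sum(1 for tone, marked in trials if tone == "no-go" and marked)
    return hits, false_alarms


# Hypothetical data: two targets, the child answers the first correctly.
stream = [3, 8, 5, 5, 2, 7, 5, 5, 9]
answers = {3: 8, 7: 2}
ct_score = score_code_transmission(stream, answers)

# Hypothetical go/no-go log: one mark made on a no-go tone.
log = [("go", True), ("go", True), ("no-go", True), ("go", False), ("no-go", False)]
wdw_score = score_walk_dont_walk(log)
```

Keeping false alarms separate from hits reflects the distinction drawn above between sustaining attention (continuing to respond on go tones) and inhibiting a response (withholding the mark on no-go tones).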

Many batteries have been developed largely by including downward extensions of tasks from adult memory tests rather than by constructing tests with developmental principles in mind. The most used batteries are the Rivermead Behavioural Memory Test for children aged five to ten years, the Test of Memory and Learning, and the NEPSY learning and memory subtests, as well as the Children’s Memory Scale (CMS) described below.

The CMS is an extension of the Wechsler Memory Scale series for adults (Lichtenberger and Kaufman, 2004[10]). The CMS was designed as part of a standard psychological or neuropsychological evaluation to provide a comprehensive assessment of learning and memory in children and adolescents aged 5-16 years.

Summary scores of the CMS include verbal immediate, verbal delayed, visual immediate, visual delayed, general memory, learning, delayed recognition and attention/concentration indexes. Immediate, delayed recall and delayed recognition scores are converted to scaled scores. Core subtests comprise stories and word tests (for verbal memory indexes); dot locations and faces (for visual/non-verbal memory indexes); and numbers and sequences (for attention/concentration indexes) (Cohen, 2011[11]).

Most of the subtests proposed in the different batteries involve visual and spatial cognition. Evaluating this domain is important for two reasons.

First, it is crucial to evaluate visuo-spatial cognition in children to disentangle a perceptual deficit from an intellectual deficit. Importantly, visuo-spatial deficits can bias not only non-verbal evaluation but also verbal subtests. For example, vocabulary is often evaluated in children through picture naming, which requires visual perception, analysis and understanding.

Second, cortical visual impairments are frequent in at-risk children (born preterm or in a neurological context such as perinatal asphyxia). While visual acuity is often screened, visual function (i.e. visual and spatial cognition) is largely neglected. For this reason, children often receive a diagnosis of intellectual, co-ordination or motor disorder when they in fact suffer from a visual deficit that affects cognitive tasks involving vision, or motor or co-ordination tasks.

Specific batteries such as the Developmental Test of Visual Perception (DTVP) (Hammill, Pearson and Voress, 2013[12]) or the Evaluation of Visuo-attentional abilities (EVA) (Cavézian et al., 2010[13]; Chokron, 2015[14]) are available to evaluate visuo-spatial and attentional capacities in children. In the DTVP, for example, the subject has to find and point to the shapes embedded in a figure presented at the top. In the EVA battery, the subject is first presented with a figure at the top and then, ten seconds later, must find it among six options.

Like visual cognition, visuo-motor co-ordination is required in a number of tasks. It can therefore be assessed in children to evaluate the presence, nature and impact of a visuo-motor co-ordination disorder on other cognitive tasks.

The Purdue Pegboard, developed in 1948, has been used most extensively in personnel selection for jobs that require fine and gross motor dexterity (Podell, 2011[15]). The test measures the gross motor dexterity of hands, fingers and arms, as well as the fine motor dexterity of fingertips. In addition to being employed in personnel selection, the Purdue Pegboard test has also been used in neuropsychological assessments. It has been found to be sensitive to the presence of brain damage (especially frontal or parietal) or of visuo-motor co-ordination deficit.

The Purdue Pegboard is a board featuring two rows of 25 holes each. At the top of the board are four cups. The pins (pegs) are kept in the outer left and right cups, and the collars and washers are kept in the middle cups. There are four subtests of the Purdue Pegboard: the first three are done with the dominant hand, the non-dominant hand and both hands, respectively, and the fourth is an assembly task. In the single-hand subtests, the subject has 30 seconds to place as many pins as possible in the holes, starting at the top of the right row in the right-hand test and at the top of the left row in the left-hand test. In the third subtest for both hands, the subject fills both rows, starting at the top, for the same amount of time. In the fourth subtest, the subject is asked to use both hands to construct “assemblies” of a pin, a washer, a collar and another washer for 60 seconds.

Besides global cognitive evaluation, it might be necessary to measure specific cognitive processes such as executive functions. Executive function has been defined in various ways [see Eslinger (1996[16]) for a review]. For Luria (1973[17]), executive function maintains an appropriate set to achieve a future goal. For Baddeley (1986[18]), executive function refers to those mechanisms by which performance is optimised in situations requiring the simultaneous operation of a number of different cognitive processes. For Welsh, Pennington and Groisser (1991[19]), it involves mostly strategic planning, impulse control and organised search, as well as flexibility of thought and action. For Denckla (1989[20]), it requires the ability to plan and sequence complex behaviours, simultaneously attend to multiple sources of information, grasp the gist of a complex situation, resist distraction and interference, inhibit inappropriate responses and sustain behaviour for prolonged periods.

Executive functions are thus higher functions that integrate others that are more basic, such as perception, attention and memory. These higher functions include the abilities to anticipate, establish goals, plan, monitor results and use feedback (Stuss and Benson, 1986[21]). Executive function also refers to regulatory control (Nigg, 2000[22]) and to a set of processes that guide, direct and manage cognitive, emotional and behavioural functions, especially during active, novel problem solving (Gioia et al., 2000[23]).

According to Baron (2018[24]), executive function can be seen as the “metacognitive capacities that allow an individual to perceive stimuli from his or her environment, respond adaptively, flexibly change direction, anticipate future goals, consider consequences and respond in an integrated or common sense way, utilising all these capacities to serve a common purposive goal.”

Executive function is thus heterogeneous and includes both broad and specific behaviours. Indeed, executive function has become an umbrella term that encompasses a number of subdomains, some more consistently endorsed than others.

In summary, executive function refers to a set of subdomains such as set shifting, hypothesis generation, concept formation, abstract reasoning, planning, organisation, goal setting, fluency, working memory, inhibition, self-monitoring, initiative, self-control, mental flexibility, attentional control, anticipation, estimation, behavioural regulation, common sense and creativity.

In its role in executive function, inhibition mediates response selection in planning and problem-solving tasks (Levin et al., 2001[25]). To respond accurately to a question, one has to inhibit inaccurate responses, as well as incorrect reasoning. Over the last two decades, inhibition has received a lot of interest as investigators attempt to parcel out contributions to effective or impaired inhibitory function. A variety of forms of inhibition are described (Nigg, 2000[22]), such as cognitive (intellectual) inhibition, interference control and motor or oculomotor inhibition. Intellectual inhibition may bias the results of cognitive evaluation even in gifted children (with an IQ well above average).

There are substantial data indicating that response inhibition is mediated by frontal cerebral regions (Stuss and Benson, 1986[21]; Mega and Cummings, 1994[26]). Patients with frontal dysfunction may exhibit too much or not enough inhibition depending on the lesion location and the type of task proposed. The developmental trajectory on inhibition tasks appears linked to prefrontal maturation (Levin et al., 2001[25]). Anterior regions of the frontal cortex continue to develop throughout childhood and into adolescence, thus influencing the level of inhibition.

Because the frontal lobe is still developing in children, unlike in adults, a clinician must consider that various strategies and/or neural pathways might operate at different maturational stages. Often the qualitative observations of the child’s performance will add critical insight into inhibitory strength or weakness. As a result, behavioural observations and error analysis become particularly useful. For example, repetition errors suggest a failure to successfully self-monitor. Perseverative errors further suggest difficulty inhibiting previous response patterns and shifting to a new response set.

Inhibition can also be evaluated in attentional tests where, for example, subjects have to inhibit distractors in visual search tasks or in divided attention tasks. The Wisconsin Card Sorting Test (WCST) is commonly used to measure inhibition and executive function (Kolakowsky-Hayner, 2011[27]). This test assesses judgement, reasoning, hypothesis generation, initiation, flexibility and inhibition. In the standard administration of the WCST, four stimulus cards are placed in front of the child. Two sets of 64 response cards become the child’s deck. The child must match each consecutive response card to one of the examiner’s stimulus cards according to the examiner’s (unstated) principle or rule. The child proposes an answer and is told only whether it is right or wrong. However, the principle changes at designated points without warning. The child must thus discover each new principle and adjust the sorting accordingly. The test ends after six complete correct sorts or once all 128 cards have been attempted.
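The administration loop described above can be sketched in a few lines of code. This is only an illustrative sketch, not the published scoring procedure: the attributes of the four stimulus cards, the order in which rules succeed one another, and the `run_length` parameter (the number of consecutive correct sorts that triggers a rule change, set here to 10) are assumptions that do not come from the text. The names `Card`, `STIMULUS_CARDS` and `administer` are hypothetical.

```python
from dataclasses import dataclass

# Assumed order in which the hidden sorting rule changes (not stated in the text).
RULES = ("colour", "shape", "number")


@dataclass(frozen=True)
class Card:
    colour: str
    shape: str
    number: int


# Assumed attributes of the four stimulus cards placed in front of the child.
STIMULUS_CARDS = (
    Card("red", "triangle", 1),
    Card("green", "star", 2),
    Card("yellow", "cross", 3),
    Card("blue", "circle", 4),
)


def administer(responses, run_length=10, max_categories=6, max_cards=128):
    """Give right/wrong feedback for each sort and track completed categories.

    responses: iterable of (response_card, pile_index) pairs, where pile_index
    (0-3) is the stimulus card the child chose to match.
    Returns (feedback, categories): a list of booleans and the number of
    completed categories.
    """
    rule_index = 0   # position in RULES of the current hidden rule
    streak = 0       # consecutive correct sorts under the current rule
    categories = 0   # completed categories ("complete correct sorts")
    feedback = []
    for card, pile in responses:
        # Stop after six categories or once all 128 cards have been attempted.
        if categories >= max_categories or len(feedback) >= max_cards:
            break
        rule = RULES[rule_index % len(RULES)]
        correct = getattr(card, rule) == getattr(STIMULUS_CARDS[pile], rule)
        feedback.append(correct)
        streak = streak + 1 if correct else 0
        if streak == run_length:
            # The rule changes without warning; the child must discover it.
            categories += 1
            rule_index += 1
            streak = 0
    return feedback, categories
```

A run of `False` entries immediately after a rule change is where perseverative errors would show up: the child keeps sorting by the previous principle despite negative feedback.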

The demand for neuropsychological evaluation varies considerably according to need. What are the deficits observed in the context of a neurological pathology? What are the effects of treatment? What is the nature of a learning disability? What are the risks of extreme prematurity or what are the effects of a psychological disorder? What are the child's needs in terms of academic adaptation or management? Is there a neuropsychological explanation for a specific behaviour?

Depending on the origin of the request and the needs, the neuropsychological evaluation takes the form of a screening intervention, a complete check-up with a view to a diagnosis or a more in-depth analysis with a view to a rehabilitation or a pedagogical project. The neuropsychological evaluation must establish not only the weak and strong points of cognitive functioning but also give the best possible account of the difficulties encountered in daily life. For this reason, questionnaires are also often used with parents and teachers to assess socio-adaptive behaviour, emotional and behavioural disorders and quality of life. Indeed, these factors may well influence or interact with the cognitive abilities being assessed. At the same time, more ecological tools using stimulations have proven their relevance in child neuropsychology.

For a long time, the neuropsychological evaluation process was confronted with a lack of tools. Today, the offer in terms of tests is considerable. Nevertheless, this does not guarantee a valid assessment of the child's cognitive profile.

The next section discusses these tests, the interpretation biases that may result and functions that remain impossible to evaluate.

The performance of a robot, machine or program is considered stable and invariable over time. However, neuropsychological tests have shown significant variability in the performance of human adults and children. This could occur between different subtests of a battery or within the same subtest over time.

Measuring natural intra-individual variability in neuropsychological tests is thus crucial. It allows the evaluator to avoid systematically interpreting these variations as an improvement or deterioration in performance. Otherwise, this variability could be mistaken for a recovery or a worsening of the disorders in the case of a brain injury.

Schretlen et al. (2003[28]) investigated the normal range of intra-individual variation in neuropsychological tests in adults. The authors derived 32 z-transformed scores from 15 tests administered to 197 adult participants in a study of normal ageing. The difference between each person's highest and lowest scores was computed to assess his or her maximum discrepancy (MD). The results show that 66% of participants produced MD values that exceeded three standard deviations (SDs). Eliminating each person's highest and lowest test scores decreased their MDs, but 27% of participants still produced MD values exceeding three SDs. Although conducted in adults, this study revealed that marked intra-individual variability is common in normal participants, which, of course, is not expected in machines.
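The maximum discrepancy measure can be illustrated with a short computation. The sketch below assumes each participant's scores have already been converted to z-scores; the function names and the example profiles are hypothetical and are not data from the study.

```python
def max_discrepancy(z_scores):
    """Maximum discrepancy (MD): highest minus lowest z-score in a profile."""
    if len(z_scores) < 2:
        raise ValueError("need at least two scores")
    return max(z_scores) - min(z_scores)


def trimmed_max_discrepancy(z_scores):
    """MD after dropping the single highest and single lowest score."""
    if len(z_scores) < 4:
        raise ValueError("need at least four scores to trim both extremes")
    return max_discrepancy(sorted(z_scores)[1:-1])


def share_exceeding(profiles, threshold=3.0, trimmed=False):
    """Proportion of participants whose MD exceeds `threshold` SDs."""
    measure = trimmed_max_discrepancy if trimmed else max_discrepancy
    flags = [measure(profile) > threshold for profile in profiles]
    return sum(flags) / len(flags)


# Hypothetical z-score profiles for four participants (illustration only).
example_profiles = [
    [-1.8, 0.2, 1.5, 0.4, -0.3],
    [-0.5, 0.1, 0.6, 0.3, -0.2],
    [-2.0, -1.6, 0.0, 1.7, 2.1],
    [0.0, 0.5, -0.5, 1.0, -1.0],
]

share_all = share_exceeding(example_profiles)
share_trimmed = share_exceeding(example_profiles, trimmed=True)
```

In this made-up sample, trimming the extremes lowers the share of profiles above the three-SD threshold but does not bring it to zero, which mirrors the pattern reported in the study.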

Along the same lines, Zabel et al. (2009[29]) examined the test-retest reliability of selected variables from the computerised continuous performance test (CPT). Participants were 39 healthy children aged 6-18 without intellectual impairment. The authors found that test-retest reliability was modest for CPT scores. This study suggests a considerable degree of normal variability in attentional scores over extended test-retest intervals in healthy children. These findings suggest a need for caution when interpreting test score changes in neurologically unstable clinical populations. They also underline that one cannot expect humans, unlike robots and machines, to perform stably over time on a test, especially one that draws on attentional resources.
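Test-retest reliability is commonly summarised as a correlation between scores from two sessions. The sketch below uses a plain Pearson correlation as a simple stand-in; this is an assumption, since the text does not say which reliability coefficient the authors used, and the scores shown are invented for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between paired scores from two test sessions."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists with at least two scores")
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first, second))
    var_first = sum((a - mean_first) ** 2 for a in first)
    var_second = sum((b - mean_second) ** 2 for b in second)
    if var_first == 0 or var_second == 0:
        raise ValueError("scores must vary within each session")
    return covariance / sqrt(var_first * var_second)


# Hypothetical attention scores for six children tested twice (illustration only).
session_1 = [52, 47, 60, 55, 49, 58]
session_2 = [50, 53, 57, 48, 51, 59]
reliability = pearson_r(session_1, session_2)
```

A coefficient close to 1 would indicate that children keep roughly the same rank from one session to the next; a modest coefficient means that a change in an individual child's score may simply reflect normal variability.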

Taken together, these studies emphasise the difficulty of directly comparing human and machine performance in neuropsychological, psychometric or attentional tests. Machine performance can thus only be compared to a range of human performance.
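One simple way to compare a machine score to a range of human performance is to locate it within the distribution of scores from a human reference group. The following is a minimal sketch under that assumption; the function names and the scores are hypothetical.

```python
def percentile_rank(human_scores, machine_score):
    """Percentage of human scores at or below the machine's score."""
    if not human_scores:
        raise ValueError("need at least one human score")
    at_or_below = sum(1 for score in human_scores if score <= machine_score)
    return 100 * at_or_below / len(human_scores)


def within_human_range(human_scores, machine_score):
    """True if the machine's score lies between the lowest and highest human score."""
    return min(human_scores) <= machine_score <= max(human_scores)


# Hypothetical reference sample on one subtest (illustration only).
human_sample = [12, 15, 9, 14, 11, 13, 10, 16]
rank = percentile_rank(human_sample, 14)
inside = within_human_range(human_sample, 14)
```

Because human scores also vary from one session to the next, the reference distribution would ideally include repeated measurements rather than a single score per person.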

Despite the use of standardised tests in child neuropsychology, it remains difficult to precisely establish the cognitive origin of a deficit. For example, a subject can fail a spatial reasoning subtest due to an attentional, visual, spatial or reasoning deficit. In a similar way, the naming task, most often considered a verbal task (lexical evocation), can be failed because of a visual recognition problem. Indeed, in the WPPSI, the vast majority of the proposed subtests involve visual perception or visuo-spatial analysis. They can therefore be failed due to a perceptual rather than intellectual disorder.

Only the neuropsychologist's clinical sense makes it possible to select a set of tests that tap the different processes involved in the failed subtest. By analysing the associations and dissociations between the performance in these different tests, the neuropsychologist will be able to identify the cognitive process at stake. This point can be illustrated through a simple example of dissociation between performances on similar tasks that differ only in the sensory modality involved. A child may be able to write a word when it is dictated orally but be unable to copy it when it is presented visually. Is there a disorder of perception and/or visual analysis that makes the copying task impossible?

In a similar way, completing a battery of perceptual, language and memory tests for several hours in a row, without difficulty, makes it unlikely there is a disabling sustained attentional disorder. Conversely, a decrease in performance as testing proceeds argues in favour of a problem sustaining attention over time. Because of the multitude of cognitive processes involved in such complex tasks, a failure on a given task, analysed in isolation, does not say anything about the underlying deficient processes.

In addition, regarding machines, the use of psychometric tasks will require a good understanding of the way the task is executed to compare robot and human performance. Different strategies used by robots and humans, for example, could explain the discrepancy between their performance. When asked to name an object presented visually, humans cannot avoid activating its function, although this is not useful for the naming task. For this reason, a tool will take longer to be named than a flower because humans cannot prevent themselves from activating its function (although this is totally irrelevant to the task). Of course, a machine will never do that. For this reason, it will remain difficult to compare the performance of machines and humans. Machines can thus outperform humans because they have less access to related knowledge.

Many authors have attempted to correlate IQ with children's academic success or adults' career success. The results in this area are contradictory. No clear link can be established between IQ score, grade level and subsequent success.

High potential children can also have relatively disabling learning and academic difficulties. These disorders can sometimes be so marked that they can mask the diagnosis of intellectual precociousness. Neuropsychologists will often see children with severe academic difficulties for whom a psychometric assessment is requested to eliminate an intellectual disability. Yet these children may obtain an IQ score much higher than the average. These children can sometimes present a heterogeneous profile on psychometric tests with an over-investment of verbal skills to the detriment of visual-spatial skills.

The origin of these disorders is not clearly understood. However, it is often hypothesised that these children need tasks of sufficient complexity to recruit their attention and enable them to complete a task. Thus, in a completely counter-intuitive way, a high IQ may interfere with the performance of overly simple tasks for which the subject does not recruit his or her attention and therefore scores below average. This is observed, for example, in some subjects who have more difficulty recalling a series of numbers in the order presented (forward span) than in reverse order (backward span).

Similarly, there is no strong argument in favour of a correlation between IQ and academic achievement, except perhaps for subjects whose performance is well below their age group. Some studies do show such correlations, but a critique is nonetheless warranted.

IQ tests were originally developed with the constraint of correlating with grade level. Not surprisingly, some subtests such as the arithmetic test correlate well with academic performance. IQ and academic achievement are both products of socio-cultural level and family environment. There is no formal evidence that IQ is exceptionally high among geniuses. The reverse is also true: a low IQ does not necessarily go hand in hand with a total lack of intelligence. This is especially true since failure on IQ tests may not be due to an intellectual deficit. Rather, it could be due to a deficit in the perceptual, motor and language skills required for IQ tests.

In this way, if comparing children to robots through IQ tests, the question arises of what exactly is being compared. IQ tests, which are supposed to test intellectual efficiency, are unlikely to actually measure intellectual ability. Rather, these tests measure a subject's ability to respond to tests that are supposed to measure intelligence (Chokron, 2014[30]). This ability may be subject to a large number of factors far removed from intellectual ability. One can thus expect that a robot, a machine or an algorithm can answer a question in a more adapted, invariable and systematic way if it has been well programmed for that. Does this mean it will be more intelligent than the human subject?

For the past few decades, tests have been over-used both for children in school orientation decisions and for adults in the context of professional recruitment. This is all the more surprising since IQ does not reliably predict academic or professional success. It is sometimes even the opposite since a non-negligible number of intellectually precocious children can present significant learning disorders.

Furthermore, training in one of the particular subtests of the IQ scales, such as number memory, induces an increase in performance for that subtest. However, it does not improve performance in the other subtests or in the same subtest carried out with other items (letters instead of numbers, for example). Therefore, how could one imagine the IQ score could predict an employee's skills in a specific position that requires the performance of tasks completely different from those required during the test?

These results are confirmed in various studies among different populations such as young street vendors in Brazil (Carraher, Carraher and Schliemann, 1985[31]). In this study, the participants demonstrate remarkable mental arithmetic skills. Yet, when subjected to the arithmetic subtest of psychometric tests, their performance proves to be much lower than expected for their age group. Performance on the same problem proposed in an abstract way during a test and in a concrete way in an ecological situation generally has little correlation.

This type of study underlines the need to consider two aspects of tasks when assessing cognitive abilities. First, there are the processes involved in the task to be performed. Second, there is the ecological character of the task and its proximity to tasks the subject performs in everyday life. This also refers to the distinction between a “psychometric test” and an ecological task. This must be considered when comparing subjects to each other or to machines. Along those lines, the notion of task seems more suitable for evaluating human performance than that of test.

The comparison of cognitive performance between human subjects and robots also refers to other important distinctions. Human subjects show a variability in their performance over time, as well as between different tasks in a battery that is not expected in robots. Moreover, robots, unlike human subjects, are not expected to be hampered by their affective state, their attentional or motivational level.

Most intellectual efficiency tests evaluate conscious processes, whereas a large part of cognitive processes is unconscious or implicit (Schacter, 1992[32]). Intellectual efficiency would inevitably be different without the good functioning of these conscious and unconscious processes. However, it remains difficult to model these unconscious processes. In a comparison of intellectual performance between robots and humans, it might be interesting to consider how the distinction between conscious and unconscious processes can be operationalised.

Child neuropsychology has made tremendous progress over the past 50 years. As the profession has grown, there has been a better understanding of cognitive and brain development and a growing interest in developmental issues. Tests for children are becoming increasingly specific and standardised rather than simple adaptations of tests for adults. In addition, research in child psychology is improving our knowledge of cognitive processes during development. It also makes it possible to develop new tests that correspond more closely to the dynamics of the child's cognitive and cerebral development. All these elements contribute to the development of child neuropsychology.

Nevertheless, in spite of this considerable progress, much research is needed to compensate for persistent weaknesses in child neuropsychological testing. Indeed, intelligence tests also contribute to these weaknesses. This point is raised not to undermine the role and value of neuropsychological or cognitive assessment. Rather, it seeks to raise awareness of the limitations of testing.

These limitations could be addressed both in clinical practice and research. In clinical practice, they can be addressed through the clinical sense and judgement, knowledge and experience of the well-trained and responsible neuropsychologist in both normal and abnormal child populations. Meanwhile, in comparing human and machine intellectual performance, researchers can reflect on how much problem-solving processes may differ in and between humans depending on a multitude of factors that do not affect robots in the same way.

In the case of a direct comparison between the performance of human subjects and of machines on psychometric tests, researchers would need to compare the subjects during tasks aiming at measuring basic functions. These comprise perceptual, attentional or memory tasks. They should involve as little affect as possible and use abstract stimuli to avoid biases in human subjects due to the use of semantic data that are totally inappropriate for the task. In addition, tasks should be performed within a limited period of time (ideally a few minutes for each subtest) to avoid any sustained attentional difficulties in humans that are absent in robots.

Moreover, to avoid any problem of motivation or investment in the human subjects, researchers should choose voluntary subjects. The subjects need to invest themselves totally in the requested task to give the best of themselves. The proposed task may bring into play subjective judgements, involving affects, emotions and choice. It may also consider previous experience or data stored in memory that could be related to the resolution of the presented task. The more this occurs, the more humans will likely outperform machines. Conversely, machines can be expected to outperform humans as tasks become simpler, more automated, more abstract and even more repetitive. Of course, selecting only certain items and not the entire test battery to compare performance of machines to humans will require a reference population of human subjects. It will be impossible to use the test norms obtained when standardising the entire battery.


[18] Baddeley, A. (1986), Working Memory, Oxford University Press, Oxford.

[24] Baron, I. (2018), Neuropsychological Evaluation of the Child: Domains, Methods, and Case Studies, Oxford University Press, Oxford.

[4] Brooks, B., M. Sherman and E. Strauss (2009), “NEPSY-II: A developmental neuropsychological assessment, Second edition”, Child Neuropsychology, Vol. 16/1, pp. 80-101, https://doi.org/10.1080/09297040903146966.

[31] Carraher, T., D. Carraher and A. Schliemann (1985), “Mathematics in the streets and in schools”, British Journal of Developmental Psychology, Vol. 3/1, pp. 21-29, https://doi.org/10.1111/j.2044-835X.1985.tb00951.x.

[13] Cavézian, C. et al. (2010), “Assessment of visuo-attentional abilities in young children with or without visual disorder: Toward a systematic screening in the general population”, Research in Developmental Disabilities, Vol. 31/5, pp. 1102-1108, https://doi.org/10.1016/j.ridd.2010.03.006.

[14] Chokron, S. (2015), “Evaluation of visuo-spatial abilities (EVA): A simple and rapid battery to screen for CVI in young children”, in Lueck, A. and G. Dutton (eds.), Impairment of Vision due to Disorders of the Visual Brain in Childhood: A Practical Approach, American Foundation for the Blind Press, Arlington, VA.

[30] Chokron, S. (2014), Peut-on mesurer l’intelligence?, Editions le Pommier, Paris.

[5] Chokron, S. (2010), Pourquoi et comment fait-on attention ?, Editions le Pommier, Paris.

[11] Cohen, M. (2011), “Children’s memory scale”, in Kreutzer, J., J. DeLuca and B. Caplan (eds.), Encyclopedia of Clinical Neuropsychology, Springer, New York.

[20] Denckla, M. (1989), “Executive function, the overlap zone between attention deficit hyperactivity disorder and learning disabilities”, International Pediatrics, Vol. 4/2, pp. 155-160.

[16] Eslinger, P. (1996), “Conceptualizing, describing and measuring components of executive function”, in Lyon, R. and N. Krasnegor (eds.), Attention, Memory and Executive Function, Paul H Brookes, Baltimore, MD.

[23] Gioia, G. et al. (2000), Behaviour Rating Inventory of Executive Function, Psychological Assessment Resources, Inc., Odessa, FL.

[12] Hammill, D., N. Pearson and J. Voress (eds.) (2013), Developmental Test of Visual Perception – Third Edition (DTVP – 3), Pearson, Toronto.

[6] Helland, T. and A. Asbjornsen (2000), “Executive function in dyslexia”, Child Neuropsychology, Vol. 6/1, pp. 37-48, https://doi.org/10.1076/0929-7049(200003)6:1;1-B;FT037.

[1] Ivnik, R., G. Smith and J. Gerhan (2001), “Understanding the diagnostic capabilities of cognitive tests”, The Clinical Neuropsychologist, Vol. 15/1, pp. 114-124, https://doi.org/10.1076/clin.

[2] Kaufman, A. and N. Kaufman (1983), Kaufman Assessment Battery for Children Interpretive Manual, American Guidance Service, Circle Pines, MN.

[27] Kolakowsky-Hayner, S. (2011), “Wisconsin card sorting test”, in Kreutzer, J., J. DeLuca and B. Caplan (eds.), Encyclopedia of Clinical Neuropsychology, Springer, New York.

[25] Levin, H. et al. (2001), “Porteus maze performance following traumatic brain injury in children”, Neuropsychology, Vol. 15/4, pp. 557-567, https://doi.org/10.1037//0894-4105.15.4.55.

[10] Lichtenberger, E. and A. Kaufman (eds.) (2004), Essentials of WPPSI-III Assessment, John Wiley & Sons Inc., Hoboken, NJ.

[17] Luria, A. (1973), The Working Brain. An Introduction to Neuropsychology, Penguin, Harmondsworth, UK.

[8] Manly, T. et al. (1999), The Test of Everyday Attention for Children Manual, Thames Valley Test Co Ltd.

[26] Mega, M. and J. Cummings (1994), “Frontal subcortical circuits and neuropsychiatric disorders”, Journal of Neuropsychiatry and Clinical Neurosciences, Vol. 6/4, pp. 358-370, https://doi.org/10.1176/jnp.6.4.358.

[22] Nigg, J. (2000), “On inhibition/disinhibition in developmental psychopathology: Views from cognitive and personality psychology and a working inhibition taxonomy”, Psychological Bulletin, Vol. 126/2, pp. 220-246, https://doi.org/10.1037/0033-2909.126.2.220.

[15] Podell, K. (2011), “Purdue pegboard”, in Kreutzer, J., J. DeLuca and B. Caplan (eds.), Encyclopedia of Clinical Neuropsychology, Springer, New York.

[7] Posner, M. and S. Petersen (1990), “The attention system of the human brain”, Annual Review of Neuroscience, Vol. 13, pp. 25-42, https://doi.org/10.1146/annurev.ne.13.030190.000325.

[3] Raiford, S. (2018), “The Wechsler Intelligence Scale for Children—Fifth Edition Integrated”, in Flanagan, D. and E. McDonough (eds.), Contemporary intellectual assessment: Theories, tests, and issues, 4th ed., The Guilford Press, New York, NY, US.

[9] Robertson, I. et al. (1995), Test of Everyday Attention, Thames Valley Test Company, Ltd., Bury St Edmunds, UK.

[32] Schacter, D. (1992), “Implicit knowledge: New perspectives on unconscious processes”, Proceedings of the National Academy of Sciences U.S.A., Vol. 89/23, pp. 11113-11117, https://doi.org/10.1073/pnas.89.23.11113.

[28] Schretlen, D. et al. (2003), “Examining the range of normal intraindividual variability in neuropsychological test performance”, Journal of the International Neuropsychological Society, Vol. 9/6, pp. 864-870, https://doi.org/10.1017/S1355617703960061.

[21] Stuss, D. and D. Benson (1986), The Frontal Lobes, Raven Press, New York.

[19] Welsh, M., B. Pennington and D. Groissier (1991), “A normative developmental study of the executive function: A window on prefrontal function in children”, Developmental Neuropsychology, Vol. 7, pp. 131-149, https://doi.org/10.1080/87565649109540483.

[29] Zabel, T. et al. (2009), “Reliability Concerns in the Repeated Computerized Assessment of Attention in Children”, The Clinical Neuropsychologist, Vol. 23/7, pp. 1213-1231, https://doi.org/10.1080/13854040902855358.

Metadata, Legal and Rights

This document, as well as any data and map included herein, are without prejudice to the status of or sovereignty over any territory, to the delimitation of international frontiers and boundaries and to the name of any territory, city or area. Extracts from publications may be subject to additional disclaimers, which are set out in the complete version of the publication, available at the link provided.

© OECD 2021

The use of this work, whether digital or print, is governed by the Terms and Conditions to be found at https://www.oecd.org/termsandconditions.