4. Understanding the six criteria: Definitions, elements for analysis and key challenges

This chapter introduces the six criteria (Figure 4.1), presenting them in the order in which they are most logically considered: starting with relevance and coherence, then effectiveness and efficiency, and finally impact and sustainability. Each criterion is defined and its importance described. Then the definition is further explained through an examination of its elements for analysis – the key concepts contained within the definition. These elements are not sub-criteria but illustrate the different ways the criteria can be applied according to the context and purpose. Connections with other criteria are also explored.

The chapter also includes initial thinking on key issues related to inclusion and the principle of “leaving no one behind”, as mandated in the 2030 Agenda. Further work is underway to explore applying an equity lens to the criteria and evaluation approaches, including specific guidance on gender equality, women’s empowerment and human rights.

For each criterion, a table outlines common challenges along with some ideas for evaluators and evaluation managers on overcoming these challenges. These tables will be updated over time as experiences with the new definitions are shared.

Finally, real-world examples are provided to illustrate ways of interpreting each criterion.

Evaluating relevance helps users to understand if an intervention is doing the right thing. It allows evaluators to assess how clearly an intervention’s goals and implementation are aligned with beneficiary and stakeholder needs, and the priorities underpinning the intervention. It investigates if target stakeholders view the intervention as useful and valuable.

Relevance is a pertinent consideration across the programme or policy cycle from design to implementation. Relevance can also be considered in relation to global goals such as the Sustainable Development Goals (SDGs). It can be analysed through four potential elements: relevance to beneficiary and stakeholder needs, relevance to context, quality of design, and relevance over time. These are discussed in greater detail under the elements for analysis. They are not exhaustive and should be included as required by the purpose of the evaluation.

The evaluation of relevance should start by determining whether the objectives of the intervention are adequately defined, realistic and feasible, and whether the results are verifiable and aligned with current international standards for development interventions. This should fit the concept of evaluability, which is detailed in the OECD’s Quality Standards for Development Evaluation (OECD, 2010[1]). Results or objective statements may be poorly phrased or vague, difficult to measure or focused on activities or inputs. In some cases the theory of change must be refined or reconstructed for the evaluator to clearly identify these objectives. Evaluators should take care to assess against good quality and realistic objectives. The indicators for measuring the achievement of the objectives should also be validated according to generally accepted criteria such as the SMART (Specific, Measurable, Attainable, Relevant and Timely) indicators (IDD and Associates, 2006[2]). Evaluations which consider relevance should consider if and how appropriate objectives have been operationalised in ways that reflect good practice. They should also reflect on the organisational capacity and capability of implementing partners and their responses to any change in context.

Evaluators should also clearly identify stakeholders whose needs and priorities should be considered during the evaluation of relevance. This includes beneficiaries, as well as funding, oversight or implementing partners and institutions. A particular emphasis should be placed on beneficiaries. Ownership of an intervention is important and beneficiaries are considered first and foremost to be the primary stakeholders in defining priorities and needs. Depending on the intervention, it may also be pertinent to consider national and sub-national (where applicable) needs, local strategies to address needs and the extent to which the intervention aligns with those needs. Institutional needs can include but are not limited to donor needs, meaning relevance can be evaluated across policy contexts including those where there is no clear donor but instead partners in an intervention (e.g. in the case of trade policy).

The definition of relevance comprises four main dimensions: responding to needs, policies and priorities; being sensitive and responsive to context; quality of design; and responsiveness over time.

Perhaps the most important element for analysing relevance is the assessment of the extent to which an intervention addresses beneficiaries’ needs and priorities. This analysis provides insight into which issues an intervention addresses and why. Beneficiaries are central stakeholders for an intervention and should be considered throughout. Beneficiaries are not necessarily people receiving direct services. Depending on the type of intervention, beneficiaries can be (much) further upstream in the results chain. For example, an intervention may aim at increasing the capacity of a national audit office. These improved capacities will strengthen public financial management and ultimately contribute to achieving sustainable development goals such as improved health and education. But for the purpose of evaluating the capacity support, the audit office staff would be treated as the primary beneficiaries. Clearly defining the beneficiaries (primary and secondary) is a necessary first step to evaluating relevance.

Analysing beneficiaries’ needs and whether they are addressed sheds light not only on responsiveness but also on ownership and participation regarding the intervention’s design and implementation (which can affect other criteria). It helps to understand who is involved in the design and who is not and, in turn, how this affects the intervention’s design and implementation.

This criterion implies that efforts should focus on areas of greatest need, or in the language of the 2030 Agenda: reaching the furthest behind first. Indeed, relevance is particularly useful in understanding who is engaged in and reached by an intervention. Relevance provides an opportunity for evaluators to consider whether and to what extent marginalised groups are incorporated in both policy and intervention priorities. Even when an intervention is perfectly in sync with official policy, it may be disconnected from the real-life priorities of the participants, who may not have been involved in setting official priorities and plans.

An evaluation of relevance should also take into account how the intervention addresses the priorities of involved institutions or partners. This includes government (national, regional, local), civil society organisations, private entities and international bodies involved in funding, implementing and/or overseeing the intervention. Relevance will examine alignment with these institutions’ strategies and policies.

To assess an intervention’s relevance to global needs, policies and priorities, an evaluation should review its contribution to overall global goals (i.e. the relative importance of this intervention compared to the broader effort). This will often involve evaluators comparing (potential) results in the context/country with alternatives elsewhere. Such questions regarding global relevance are not always examined during the intervention design. Evaluators can thus provide useful analysis on these questions to assist stakeholders in understanding the strategic significance of an intervention beyond its local context.

The definition of relevance also calls on evaluators to look at potential tensions or trade-offs with regard to whose needs and priorities are met through the intervention. Various perspectives of the participants and other stakeholders may be misaligned and so the evaluation will need to unpack these differences and explore the implications of choices made. To provide an example, interventions aimed at eliminating a disease – such as polio – from all countries would examine the relative disease burden to determine the global priority for action in a particular country or region. There may be cases where the global priority for that intervention (polio vaccinations in the last remaining region with community transmission) may be at odds with local priorities (with beneficiaries prioritising water and sanitation issues, for instance). It can be useful for the evaluators to unearth such tensions, through careful analysis of relevance.

The needs of beneficiaries and other key stakeholders cannot be understood in isolation and are shaped by their context. Context is multifaceted and includes the following factors: economic, environmental, equity, social, cultural, political economy and capacity considerations. Evaluators are encouraged to understand which contextual factors are most pertinent to an intervention.

Contextual relevance can be analysed both in intervention design and implementation. The consideration of context will also be dependent on whether an evaluation is ex ante or ex post. For example, evaluators can ask questions around how the context was understood and accounted for when the intervention was designed. For ex-post evaluations, evaluators should consider whether the context changed between the inception and the end of the intervention. Ex-post evaluations will have more context and should aim to incorporate this in their analysis. This complements the time element for analysis of the relevance criterion, by considering any fluctuations in the relevance of an intervention as circumstances change.

Historical context can also be considered. For example, have similar interventions occurred before? Are there historical tensions, legislation or politics that may impact the understanding of needs and shaping of goals? Analysis of historical context can also identify assumptions that were made in the past about an intervention’s relevance and test whether these persist in the current context. Where previous evaluations have been conducted, these may also be helpful in tracing the historical context and in assessing whether interventions capitalise on lessons learned from previous evaluation exercises.

“Quality of design” considers how well the intervention was built to address relevant priorities and needs and whether goals have been clearly specified. Moreover, it assesses if stakeholders’ priorities and needs are articulated in the intervention’s objectives, its underlying theory of change, theory of action and/or modus-operandi. This allows evaluators to understand gaps in programme design that may have undermined an intervention’s overall relevance. This element for analysis also influences the evaluability of the overall intervention by adding a focus on the intervention’s design quality at the outset. It also provides insight into the intervention’s appropriateness to the institution implementing it. For example, evaluators can consider if it has been designed with technical, organisational and financial feasibility in mind.

Evaluators should consider how interventions can evolve over time. Outbreaks of conflict, or changing policy and economic contexts, for example, will significantly affect implementation. Relevance regarding time considerations should include adaptive management analysis. Evaluators should consider relevance not only at the beginning and end of a programme, but how it has responded to changes over the course of its lifespan. This allows evaluators to examine any fluctuations in the relevance of an intervention as implementation conditions change. For example, this could include an analysis of how suitable the adaptations were in continuing to meet the most important needs and priorities and whether adaptation affected the quality of the design over time.

Again, ongoing adaptation to external contexts and internal changes should be taken into account (e.g. when a change in funding necessitates a change in programming). Additionally, risks and opportunities can be considered, including the extent to which the programme mitigated risks that would undermine its purpose, or was adapted to seize a good opportunity, or to better meet needs. Adaptation may lead to trade-offs in whose needs are prioritised, raising questions of accountability. This should be fully explored to understand how it may, or may not, have altered a programme’s relevance.

As the criteria are interrelated, relevance can be linked to other criteria in the evaluation. Relevance is often viewed as a prerequisite for achieving the other criteria.

  • The evaluation of relevance provides a foundation to understand if needs are met as part of effectiveness and impact. Indeed, relevance as a criterion is a prerequisite for effectiveness, as needs and goals must be clearly identified and articulated to enable the assessment of effectiveness.

  • Relevance complements coherence. Both require contextual analysis: for relevance, in order to understand the alignment with priorities and needs of key stakeholders; and for coherence, so as to understand linkages with other interventions. Relevance focuses on how an intervention responds to context. Coherence zooms out, looking at other interventions in that context and how they interact with the intervention being evaluated. Taken together, relevance and coherence can provide a clearer view of how the intervention affects – and is affected by – the context in which it is implemented.

  • The analysis of relevance also relates to the impact criterion, which looks at the ultimate significance of an intervention – including its value to those affected. Evaluators should spend sufficient time examining the needs, priorities and policies of all actors (including potential tensions among them) to be able to sufficiently assess the overall relevance of the intervention and to further analyse its significance when looking at impact.

  • Many of the elements of relevance are critical factors in efficiency and sustainability: a relevant intervention is likely to have greater support among stakeholders, which can influence the timeliness of delivery and use of resources, as well as the degree of ownership of the resulting benefits (and thus their sustainability).

Understanding gendered power dynamics and reflecting on the SDG commitment to “leave no one behind” are crucial in understanding relevance. Gendered power dynamics and the marginalisation of certain groups – including racial/ethnic groups – are central considerations for understanding relevance in a particular context.

Understanding who was involved in intervention design and how they were involved, with special attention to power dynamics and marginalised groups, will help evaluators understand the relevance of the intervention as designed, as well as the extent to which the intervention was responsive to changing needs over time.

The definition of relevance above emphasises the importance of considering trade-offs between different needs and priorities, including greater consideration of equity and power dynamics between people affected by the intervention directly or indirectly.

Here there is a strong link with human rights and equality, particularly when an intersectional lens is applied. Such a lens considers how multiple forms of social and political identity, such as gender, disability, ethnicity, sexuality and social class, combine to create discrimination and inequality. When identifying priorities for the analysis of relevance, it is essential to consider under-represented and marginalised groups (groups that may be restricted in their access to services and/or rights) and how their needs and priorities are – or are not – captured in formal documents and policies. In addition, it will be important to take into account whether the intervention incorporates different levels of access, given constraints faced by particular groups.

The table below identifies several common challenges when evaluating relevance – the range of needs and priorities to consider, poorly articulated objectives and changes in the context – and suggests ways of addressing them for both evaluators and evaluation managers.

This section includes a cross section of examples from evaluating the relevance of general budget support, farmer livelihood supports and health sector programmes.

In today’s world, greater attention must be paid to coherence, with an increased focus on the synergies (or trade-offs) between policy areas and the growing attention to cross-government co-ordination. This is particularly the case in settings of conflict and humanitarian response, and when addressing the climate emergency.

In line with the 2030 Agenda and the SDGs, this new criterion encourages an integrated approach and provides an important lens for assessing coherence including synergies, cross-government co-ordination and alignment with international norms and standards. It is a place to consider different trade-offs and tensions, and to identify situations where duplication of efforts or inconsistencies in approaches to implementing policies across government or different institutions can undermine overall progress.

This criterion also encourages evaluators to understand the role of an intervention within a particular system (organisation, sector, thematic area, country), as opposed to taking an exclusively intervention- or institution-centric perspective. Whilst external coherence seeks to understand if and how closely policy objectives of actors are aligned with international development goals, it becomes incomplete if it does not consider the interests, influence and power of other external actors. As such, a wider political economy perspective is valuable to understanding the coherence of interventions.

In addition, the sources (both international and domestic) of financing for sustainable development are increasingly diverse. The reference to “international norms and standards” in the definition encourages analysis of the consistency of the intervention with the actor’s own commitments under international law or agreements, such as anti-corruption statutes or human rights conventions. This applies to those agreements to which the entity has already committed and is therefore covered under internal coherence. Previously, this type of coherence was not often sufficiently analysed. International norms and standards may also be assessed under relevance from the viewpoint of responsiveness to global priorities, which is a complementary angle.

Coherence includes the dimensions of internal coherence and external coherence.

Internal policy coherence considers two factors: the alignment with the wider policy frameworks of the institutions; and the alignment with other interventions implemented by the institution including those of other departments responsible for implementing development interventions or interventions which may affect the same operating context. It should consider how harmonised these activities are, if duplication of effort and activities occurs, and if the interventions complement each other.

Within national governments (or, where applicable, other levels of government), challenges to coherence arise between different types of public policy, between different levels of government and between different stakeholders (both state and non-state, commercial and non-commercial). This should be carefully considered when evaluating coherence to understand where the intervention fits within this picture and the extent to which it is aligned with the policies governing the wider context.

For example, the Japanese Ministry of Foreign Affairs’ ODA Evaluation Guidelines (Ministry of Foreign Affairs Japan, 2019[6]) support commissioners and implementers of evaluations in assessing the coherence of diplomatic and development strategies and actions across the Japanese Government. These guidelines provide a framework and advice to help evaluators consider interconnections, complementarity and coherence of diplomatic and official development assistance (ODA) strategies. This supports a holistic analysis of Japan’s engagement and support for different sectors and countries.

Policy coherence can be understood from a horizontal perspective. For example, in the humanitarian-development-peace nexus there may be a strong need for coherence as one actor may have interventions covering development, military and security policy. In the environmental field, this could also refer to the need for coherence across the water-energy-food nexus, or the gender equality-climate change nexus. In other contexts, the ways in which non-development policy areas, such as trade, affect the intervention could be considered.

From a vertical perspective, policy coherence can be understood at different levels of an institution, or across different parts of a single government’s development finance (e.g. its bilateral agency, development finance institution [DFI] and multilateral support). It could also consider how the intervention supports or undermines policy goals across geographic levels. For example, it can consider how well a local development intervention aligns with national development objectives and interventions at a national level or vice versa.

External coherence has two main considerations: alignment with external policy commitments; and coherence with interventions implemented by other actors in a specific context.

From a policy perspective, external coherence considers the intervention’s alignment with external policy commitments such as the SDGs, and how these are taken into account in the intervention’s design and implementation. It is important to consider an institution’s commitment to the SDGs at this point, as SDG Target 17.14 under Goal 17 aims to “enhance policy coherence for sustainable development”. This is an important consideration as it encapsulates how both policy alignment and accountability for the SDGs are mainstreamed and implemented in practice.

Looking at implementation in a specific context, evaluators should consider coherence with interventions implemented by other actors. For example, how are services provided by a range of actors – are there overlaps or gaps? Coherence considers how the intervention adds value in relation to others and how duplication of effort is avoided.

An evaluation of external coherence should maintain focus on the specific intervention or institution at hand while situating it within the wider context of humanitarian and sustainable development actors. This can include whether interventions are designed within and using existing systems and structures such as coordination mechanisms at the country or sector levels.

Coherence is connected in particular with relevance, effectiveness and impact.

  • While relevance assesses the intervention at the level of the needs and priorities of the stakeholders and beneficiaries that are directly involved, coherence goes up to the next level and looks at the fit of the intervention within the broader system. Both relevance and coherence consider how the intervention aligns with the context, but they do so from different perspectives.

  • Coherence is often a useful angle through which to begin examining unintended effects, which can be captured under effectiveness and impact. While the intervention may achieve its objectives (effectiveness) these gains may be reversed by other (not coherent) interventions in the context.

  • Likewise there are links with efficiency: incoherent interventions may be duplicative, thus wasting resources.

Internal coherence provides a useful lens for considering inclusion, in particular as it relates to human rights commitments, norms and standards. Evaluators can consider the intervention’s compatibility with inclusion and equality norms and standards at a national or institutional level for the implementing institutions and perspectives of local organisations, such as grassroots indigenous peoples’ groups and disabled people’s organisations. Assessment of coherence can provide useful insights into the value and coherence of activities that aim to reduce exclusion, reach marginalised and vulnerable groups, and transform gender inequalities.

Analysis of inclusion in relation to coherence should be considered when evaluators explore the extent to which impact was inclusive and the intervention was relevant, as there are synergies between these three areas of evaluative enquiry.

The table below identifies several of the key challenges when evaluating coherence – including challenges related to breadth of scope, mandate and data availability. Suggestions are made for ways of addressing them for both evaluators and evaluation managers.

Though coherence is a new criterion for the OECD DAC, it has featured in many evaluations over the years. The criterion of coherence is also routinely used in humanitarian evaluations. This section includes a cross section of examples demonstrating how coherence has been evaluated in a strategic evaluation of policy coherence for development in Norway, natural disaster recovery in the Philippines and a country-portfolio evaluation in Montenegro.

Effectiveness helps in understanding the extent to which an intervention is achieving or has achieved its objectives. It can provide insight into whether an intervention has attained its planned results, the process by which this was done, which factors were decisive in this process and whether there were any unintended effects. Effectiveness is concerned with the most closely attributable results and it is important to differentiate it from impact, which examines higher-level effects and broader changes.

Examining the achievement of objectives on the results chain or causal pathway requires a clear understanding of the intervention’s aims and objectives. Therefore, using the effectiveness lens can assist evaluators, programme managers or officers and others in developing (or evaluating) clear objectives. Likewise, effectiveness can be useful for evaluators in identifying whether achievement of results (or lack thereof) is due to shortcomings in the intervention’s implementation or its design.

Under the effectiveness criterion, evaluators should also identify unintended effects. Ideally, project managers will have identified risks during the design phase and evaluators can make use of this analysis as they begin their assessment. An exploration of unintended effects is important for identifying both negative results (e.g. an exacerbation of conflict dynamics) and positive ones (e.g. innovations that improve effectiveness). Institutions commissioning evaluations may want to provide evaluators with guidance on minimum standards for identifying unintended effects, particularly where these involve human rights violations or other grave unintended consequences.

The definition of effectiveness encourages evaluators and managers to ask important questions about the distribution of results across different groups, whether intended or not. This is deliberate and intended to strengthen considerations of equity, which is in line with the SDG policy priority to “leave no one behind”. It encourages evaluators to examine equity issues, whether or not equity is a specific objective of the intervention. Such analysis requires data and may entail an investment of resources – which is often justified because of the valuable insights the evaluation can provide.

In drawing conclusions about effectiveness, evaluations should concentrate on the results that are most important in the context and for the evaluation audience. The term “relative importance” emphasises the message that one should exercise evaluative judgement and weigh the importance of the achieved/unachieved objectives and results, including unintended consequences, when drawing conclusions about effectiveness.

The definition of effectiveness includes the key concepts of: achievement of objectives, the varying importance of objectives and results, differential results across groups and understanding the factors that influence outcomes.

The primary focus of assessing effectiveness remains on establishing whether an intervention has achieved its intended results at different levels of the results chain (usually outputs and outcomes but also in some cases impacts). The results chain should be specified as part of the design of the intervention and is the key reference point for management, monitoring and evaluation.

It is very difficult to assess the effectiveness of an activity if its stated objectives or planned results are vague or ambiguous, or have shifted during the course of the intervention without the intervention being updated or restructured. Intervention managers should at least explain why goals have changed and what the new goals are. If this has not been done, evaluators will need to consult intervention documents or interview stakeholders to recreate the logic underpinning changes in the intervention over time. Based on the reconstructed logic, evaluators can then judge the extent to which the new objectives were relevant and effectively reached.

Evaluating effectiveness is also important in adaptive programmes where changes are made iteratively, based on feedback from stakeholders, emerging results and changes in context. In adaptive programmes, the design and implementation of an intervention may go through numerous incremental changes over time. In these situations, it is important for evaluators to reflect on and review theories of change with reference to wider systems in which an intervention is located and take into account any records showing how and why changes have been made. The evaluation of effectiveness should reflect current objectives. Reviewing the logic and need for changes to implementation or objectives (often captured in updated theories of change or results frameworks) should inform evaluations of effectiveness.

Evaluating effectiveness may involve establishing observable changes in the target group or environment over the intervention’s implementation as well as establishing causality of the observed changes at different levels, i.e. showing that the changes were caused by the intervention or that the intervention contributed to the changes as opposed to other environmental factors or, alternatively, another intervention. Methodologies should be designed to allow the evaluator to draw out how results came about and what the reasons (explanatory factors) were for achievement, underachievement or non-achievement.

Evaluating effectiveness includes examining the intervention’s results. “Results” are defined by the OECD DAC as the intervention’s output, outcome or impact (intended or unintended, positive and/or negative) (OECD, 2002[10]). Therefore evaluating effectiveness may also include the assessment of any unintended effects, both positive and negative, that have occurred as a result of the intervention. The implementation of interventions always has the potential to cause unintended social, economic or environmental effects, or may cause effects that are not intended but could have been foreseen. Therefore, evaluations should be careful to consider effects that fall outside of those specified in the intervention objectives. This can also extend to examining the potential benefits or risks arising from these unintended (predictable or unpredictable) effects. The extent to which the intervention contributed to the realisation of national or other relevant development goals and objectives in the context also falls under effectiveness – while the potential for these contributions will be examined under relevance.

When evaluating effectiveness, evaluators explore the achievement (or lack of achievement) of the various objectives and results of the intervention. If some – but not all – of the objectives were achieved, the evaluators will need to examine their relative importance to draw conclusions on the effectiveness. This may draw on the analysis of relevance, which should have addressed the potential differences between the priorities and needs of the various stakeholders. This implies that evaluators may conclude the intervention was effective in some ways but not others, or effective from the perspective of some stakeholders, but not others.

Evaluators should consider inclusiveness and equity of results amongst beneficiary groups – whether the beneficiaries are individuals or institutions. Understanding differential results can include looking at the extent to which the intervention ensured an inclusive approach in design and implementation. For example, an evaluation could examine the process through which the intervention’s objectives were formulated. This includes whether the objectives were formulated based on a needs analysis and consultation process amongst stakeholders (including the main target group). Through this process, insight may also be gained into whether the intervention missed any opportunities to generate results for its target population or beneficiaries, including contributing to longer term change, such as reduction in inequalities. Evaluators may examine unintended or unexpected results as well as the intended result, taking into account the fact that certain changes and transformations are subtle and/or long term and may be difficult to quantify.

Examining the factors that influence results is important because it helps evaluators to understand why an intervention may or may not have achieved its goals, which helps partners identify areas for improvement. Factors may be internal to the intervention or external. Such factors might include those related to: management, human resources, financial aspects, regulatory aspects, implementation modifications or deviation from plans. Quality of implementation (and adherence to implementation protocols) is often a driving factor of effectiveness, and should be described before evaluating effectiveness, efficiency, impact and sustainability.

Externally, evaluators should consider positive and negative effects arising from the intervention’s context, which in turn contribute to achievement or non-achievement of results. This can include assessing the intervention’s adaptive capacity in response to contextual changes. Evaluations can also examine the timeliness of results (e.g. phasing of support to target groups or environments which aided delivery of results).

Effectiveness is linked with other criteria, particularly relevance and impact:

  • Under relevance, the objectives of the intervention are identified; progress towards these objectives is determined by effectiveness. It is of course possible that an intervention that is not relevant is nonetheless delivered effectively. In the case of such a disconnect, evaluators will need to use judgement when drawing conclusions overall, as one cannot simply ignore findings from one criterion in favour of another.

  • Effectiveness and impact are complementary criteria focusing on different levels of the results chain. Effectiveness considers the achievement of results relative to an intervention’s objectives, namely at the output and outcome level, whereas impact focuses on higher-level results, namely what the declared higher-level results are and what contributes to these. In general, intervention managers and evaluators should ensure that a clear distinction is made between the different results levels (i.e. input, output, outcome and impact) and that it is clear which aspects will be evaluated under each criterion.

The definition of effectiveness encourages an in-depth consideration of equity between different groups. Evaluators should assess how inclusive the intervention has been for different beneficiary groups and how key principles such as equity, non-discrimination and accountability have been incorporated at all stages, from design through to results. In accordance with “leave no one behind”, particular attention should be given to the extent to which the intervention has met the needs of the most marginalised. It is important to examine the achievement and distribution of results in relation to various beneficiary groups and explain any differences.

Moreover, evaluators should consider if, how and why results contribute to tackling inequality. Under this criterion evaluators should examine how specific activities impact the welfare of specific groups and whether these activities provide participants with opportunities for empowerment.

The table below identifies challenges related to data, formulation of objectives and attribution of results, and suggests ways of addressing them for both evaluators and evaluation managers.

This section includes a cross section of examples demonstrating how effectiveness has been evaluated with regard to electoral assistance and a country programme evaluation in Cabo Verde.

This criterion is an opportunity to check whether an intervention’s resources can be justified by its results, which is of major practical and political importance. Efficiency matters to many stakeholder groups, including governments, civil society and beneficiaries. Better use of limited resources means that more can be achieved with development co-operation, for example in progressing towards the SDGs where the needs are huge. Efficiency is of particular interest to governments that are accountable to their taxpayers, who often question the value for money of different policies and programmes, particularly decisions on international development co-operation, which tends to be more closely scrutinised.

Operationally, efficiency is also important. Many interventions encounter problems with feasibility and implementation, particularly with regard to the way in which resources are used. Evaluation of efficiency helps to improve managerial incentives to ensure programmes are well conducted, holding managers to account for how they have taken decisions and managed risks.

There are several important assumptions and points to note:

  • Resources should be understood in the broadest sense and include full economic costs (human, environmental, financial and time). They are not the same as the programme budget or the money spent.

  • Results should also be understood in a comprehensive sense, covering the whole of the results chain: outputs, outcomes and impacts. Depending on the type of evaluation, some organisations associate efficiency with outputs only; however, the criterion is defined and conceptualised here to encourage evaluating efficiency also in relation to higher-level effects such as impacts, though this can often be challenging.

  • Evaluability: The ability to assess effectiveness, impact, coherence and sustainability affects what can be said about efficiency.

  • Efficiency is about choices between feasible alternatives that can deliver similar results within the given resources. Before cost-effectiveness comparisons can be made, alternatives must be identified that are genuinely feasible and comparable in terms of quality and results.

For these reasons, efficiency analysis should be firmly grounded in analysis of the context, since for a given example it may be more costly to reach the intended beneficiaries but also more important and justifiable in terms of development impacts.

Evaluating efficiency involves looking at the key areas of economic efficiency, operational efficiency, and timeliness.

This is the primary element for analysing efficiency. Economic efficiency is used here to refer to the absence of waste and the conversion of inputs into results in the most cost-efficient way possible. It includes assessing the efficiency of results at all levels of the results chain: outputs, outcomes and impacts. This also involves evaluating the extent to which appropriate choices were made and trade-offs addressed in the design stage and during implementation. These choices include the way that resources were allocated between target groups and time periods, as well as the options that were available for purchasing inputs according to market conditions.

Operational efficiency is also an important element to consider. It deals with how well resources are used during implementation. Questions to help explore operational efficiency include: Were the human and financial resources used as planned and appropriately and fully utilised (or were resources misallocated, or budgets underspent or overspent)? Were resources redirected as needs changed? Were risks managed? Were decisions taken which helped to enhance efficiency in response to new information? Were the logistics and procurement decisions optimal?

Closely related to both economic and operational efficiency, timeliness starts by asking whether and to what extent the results were achieved within the intended timeframe. It is also the opportunity to check if the timeframe was realistic or appropriate in the first place. In addition, was it reasonably adjusted during the intervention, given that for many interventions external factors and changes to the programme are likely? Evaluators must assess if efforts were made to overcome obstacles and mitigate delays in how the intervention was managed, as the situation evolved.

As already noted, the different criteria are connected and should be seen as alternative lenses for looking at the intervention, rather than rigid boundaries. Some of the interconnections with other criteria are:

  • Relevance and efficiency: A key aspect of operational relevance is whether the intervention design responded well to the context allowing for considerations of feasibility and capacity. In practical terms, whether the design was feasible and could be implemented also has a direct effect on efficiency. Thus, in this specific aspect the evaluator may end up looking at both issues together.

  • Efficiency and results: Since efficiency involves assessing to what extent the resources used were converted into results, all aspects of results (i.e. questions arising when assessing effectiveness, impact and sustainability) should be considered. Operational efficiency is closely related to effectiveness and impact. Often, looking at how well things are working within an intervention involves looking at effectiveness and efficiency simultaneously. This is particularly true, for instance, when identifying bottlenecks and how to address them, or ensuring resources are allocated to where they are needed.

Through the lens of the efficiency criterion, evaluators can understand how inclusion is integrated and understood in the intervention’s management and the extent to which resource use reflects differential experiences and results for different people. The cost of achieving results often varies across beneficiaries, with those “furthest behind” being the most difficult – and expensive – to reach. Analysis of efficiency should therefore be infused with a clear understanding of inequalities and power dynamics in the context, as well as an understanding of how the intervention fits with the need for transformational change to address underlying inequalities. Efficiency analysis is a key place to consider whether or not a commitment to the “leave no one behind” agenda (and the 2030 Agenda aim of achieving transformational change for marginalised groups) has been meaningfully and effectively operationalised across management, decision making and resource allocation.

Here, analysis can include how and why resources are allocated between the different groups being targeted by an intervention and the extent to which resource allocation was based on needs and engagement with marginalised groups. Evaluators can consider if inclusive and equitable results are achieved at a reasonable cost, how “reasonable cost” is defined and determined and how such a cost varies between different groups of beneficiaries. For instance, if the intervention commits to reaching specific groups, are sufficient resources allocated and justified so as to do this successfully?
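As a minimal sketch of how the cost of reaching different groups might be compared, the following Python snippet computes a unit cost per beneficiary for each group. The group names and all figures are hypothetical and serve only to illustrate the calculation; what counts as a "reasonable cost" remains a judgement for the evaluation.

```python
# Hypothetical figures for illustration only; not drawn from any real evaluation.
groups = {
    "urban households": {"cost": 120_000.0, "beneficiaries_reached": 4_000},
    "remote rural households": {"cost": 90_000.0, "beneficiaries_reached": 1_000},
}


def cost_per_beneficiary(cost: float, reached: int) -> float:
    """Unit cost of reaching one beneficiary; undefined if nobody was reached."""
    if reached <= 0:
        raise ValueError("reached must be positive")
    return cost / reached


# Disaggregate unit costs by group rather than reporting a single average.
unit_costs = {
    name: cost_per_beneficiary(g["cost"], g["beneficiaries_reached"])
    for name, g in groups.items()
}

for name, unit_cost in unit_costs.items():
    print(f"{name}: {unit_cost:.2f} per beneficiary")
```

A higher unit cost for a harder-to-reach group is not in itself a sign of inefficiency; the comparison is a starting point for asking whether the allocation was justified by needs.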

Understanding whose voices are heard and listened to when decisions are made about how policies are designed, how funds are spent and who has control and oversight of these processes is a key consideration. When intervention logic and plans include changing unequal structures and power dynamics, evaluations should consider the extent to which they have been successful or whether they have unintentionally reinforced existing unequal structures and dynamics.

It is also important to consider whether interventions collect relevant, disaggregated monitoring data to enable implementers to take relevant decisions on the focus of activities/objectives and resources allocated to inclusive development.

The appropriate way of applying the efficiency criterion will depend entirely on the nature of the intervention itself and will be different for projects, programmes, country programmes and policies. The following example from the Dutch Ministry of Foreign Affairs shows an application of the criterion at the policy level within the water sector, where efficiency is understood largely in terms of co-ordination and practical aspects of planning and partnerships across a complex set of relationships (Box 4.11).

Table 4.4 identifies several of the key challenges when evaluating efficiency and suggests ways of addressing them for both evaluators and evaluation managers.

A basic decision is whether and how to use traditional economic measures and related tools such as cost-benefit analysis, rates of return, cost-effectiveness analysis, benchmarking comparisons, etc. to evaluate efficiency.3 This depends on the purpose of the evaluation, the intervention and intended results, feasibility, available data/resources, and the intended audience.
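To make two of the measures named above more concrete, the sketch below computes a benefit-cost ratio, a net present value and a cost-effectiveness ratio in Python. The function names, the 10% discount rate and all figures are assumptions chosen for illustration; they do not represent a prescribed method, and real analyses would need to justify the discount rate and the valuation of benefits.

```python
def present_value(flows, rate):
    """Discount yearly amounts (year 0 first) back to year 0."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(flows))


def benefit_cost_ratio(benefits, costs, rate):
    """Ratio of discounted benefits to discounted costs."""
    return present_value(benefits, rate) / present_value(costs, rate)


def cost_effectiveness_ratio(total_cost, outcome_units):
    """Cost per unit of a non-monetised outcome (e.g. per household connected)."""
    if outcome_units <= 0:
        raise ValueError("outcome_units must be positive")
    return total_cost / outcome_units


# Hypothetical figures, for illustration only.
costs = [100.0, 0.0, 0.0]       # all costs incurred in year 0
benefits = [0.0, 66.0, 72.6]    # monetised benefits in years 1 and 2
discount_rate = 0.10            # assumed discount rate

bcr = benefit_cost_ratio(benefits, costs, discount_rate)
npv = present_value(benefits, discount_rate) - present_value(costs, discount_rate)
cer = cost_effectiveness_ratio(total_cost=50_000.0, outcome_units=250)

print(f"benefit-cost ratio: {bcr:.2f}")
print(f"net present value: {npv:.2f}")
print(f"cost per outcome unit: {cer:.2f}")
```

Cost-benefit analysis requires benefits to be expressed in monetary terms, whereas cost-effectiveness analysis only requires a count of outcome units, which is one reason the choice of tool depends on the available data.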

The usefulness of different analytical tools will also depend on what approach was used at the design/approval stages within the relevant institution, as this will have major implications for the availability of information required to undertake different types of analysis. Within multilateral development banks and in some public sector capital investment programmes, very clear guidance is provided on economic, social and environmental appraisal ex ante and increasingly on applying gender analysis to such projects and programmes. It makes sense to use the same tools for assessing efficiency during project appraisal or approval (generally ex ante, before the intervention is implemented) as those used during evaluation. The policy rules and guidance adopted by the institution will also partly determine what data and indicators are available to the evaluator (e.g. whether rates of return were estimated during the economic appraisal, if one exists, and whether alternatives were identified).

This section includes a cross section of examples demonstrating how efficiency has been evaluated in cases of rural electrification, agricultural input subsidies and a significant portfolio of work on water management.

The impact criterion encourages consideration of the big “so what?” question. This is where the ultimate development effects of an intervention are considered – where evaluators look at whether or not the intervention created change that really matters to people. It is an opportunity to take a broader perspective and a holistic view. Indeed, it is easy to get absorbed in the day-to-day aspects of a particular intervention and simply follow the frame of reference of those who are working on it. The impact criterion challenges evaluators to go beyond and to see what changes have been achieved and for whom. The importance of this is highlighted in the Swedish International Development Cooperation Agency’s (Sida) Evaluation Manual (Molund and Schill, 2004[17]):

“The impact criterion provides an important corrective to what could otherwise become an overly narrow preoccupation with the intentions of those who plan and manage development interventions and a corresponding neglect of the perspectives of target groups and other primary stakeholders.”

Although the use of the word impact is commonplace, it is important to note that there is often confusion in how it is understood, which could affect how stakeholders understand the evaluation. First, in a political context it can be used loosely to mean “results” in the broadest sense, encompassing both effectiveness and impact as defined here, as well as other aspects of performance. Second, in recent years it has often been confused with the term “impact evaluation”, referring to specific methodologies for establishing statistically significant causal relationships between the intervention and observed effects.4 When used in this way, impact may refer to results anywhere along the results chain, including outputs, and almost always refers to desired effects. For these reasons, it is important to clarify with stakeholders how they understand the term at the outset and explain how it is being used in the evaluation context as a criterion to examine higher-level effects.

Questions that the impact criterion might cover include:

  • Has the intervention caused a significant change in the lives of the intended beneficiaries?

  • How did the intervention cause higher-level effects (such as changes in norms or systems)?

  • Did all the intended target groups, including the most disadvantaged and vulnerable, benefit equally from the intervention?

  • Is the intervention transformative – does it create enduring changes in norms – including gender norms – and systems, whether intended or not?

  • Is the intervention leading to other changes, including “scalable” or “replicable” results?

  • How will the intervention contribute to changing society for the better?

The definition of impact includes the key concepts of higher-level effects, significance, differential impacts, unintended effects and transformational change.

The impact criterion captures the “so what?” question of an evaluation. It examines the significance of the intervention and its higher-level results, meaning how much it mattered to those involved.

The definition is intended to encourage evaluators to consider different perspectives, according to the setting. The evaluator should think carefully about the context, as well as the needs and priorities of the intended beneficiaries of the intervention, the agreed policy goals of the relevant institutions and the nature of the intervention itself. This element for analysis can also be applied when considering an intervention’s unintended results.

In assessing “significance”5, evaluators should be aware of the importance of considering different perspectives and using a systematic approach informed by the needs of stakeholders. They should also take measures to keep their (implicit) biases and value judgements from affecting their evaluation of the intervention's significance.

In keeping with the SDG remit to “leave no one behind” and to safeguard human rights, including gender equality, assessing the differential impacts is important. Positive impacts overall can hide significant negative distributional effects. It is essential to consider this at the evaluation design stage, or indeed at the intervention design stage, to ensure that impact by target group can be monitored and then evaluated. This requires early planning in design and evaluation to ensure that disaggregated data is available where feasible and may also involve looking at a range of parameters around exclusion/inclusion. It will involve a granular analysis of disaggregated data where available.
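The point that a positive overall result can hide negative distributional effects can be illustrated with a small Python sketch. The groups and values below are hypothetical, and the calculation is purely descriptive: it compares average observed changes and does not by itself establish that the intervention caused them.

```python
from statistics import mean

# Hypothetical change in an outcome indicator per household, by group.
records = [
    {"group": "A", "change": 4.0},
    {"group": "A", "change": 6.0},
    {"group": "A", "change": 5.0},
    {"group": "B", "change": -2.0},
    {"group": "B", "change": -1.0},
]

# Aggregate view: one average across all households.
overall = mean(r["change"] for r in records)

# Disaggregated view: one average per group.
by_group = {}
for r in records:
    by_group.setdefault(r["group"], []).append(r["change"])
group_means = {group: mean(values) for group, values in by_group.items()}

print(f"overall average change: {overall:.2f}")
for group, value in sorted(group_means.items()):
    print(f"group {group} average change: {value:.2f}")
```

In this illustration the overall average is positive while group B is worse off on average, which is only visible because the monitoring data were disaggregated.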

Evaluators should consider if an intervention has unintended effects. This analysis should include the extent to which impacts were intended or envisaged when the intervention was designed. Unintended effects can be positive or negative. Where they are positive, evaluators should consider their overall significance and whether there is scope for innovation or scaling or replication of the positive impact on other interventions. Evaluators should pay particular attention to negative impacts, particularly those that are likely to be significant including – but not limited to – environmental impacts or unintended impacts on vulnerable groups.

The definition describes transformational change as “holistic and enduring changes in systems or norms”. Transformational change can be thought of as addressing root causes, or systemic drivers of poverty, inequalities, exclusion and environmental damage, and is recognised in the 2030 Agenda as necessary to achieving the Sustainable Development Goals. It is becoming more and more common for interventions to aim at contributing to transformational change and evaluators are increasingly called upon to answer questions about effects on norms and systems (social, economic, or political systems), when assessing the impact criterion. For example, an evaluation might examine the extent to which traditional gender roles have been modified in some way (see Box 4.12).

As with the other criteria, the impact criterion interacts conceptually with other criteria:

  • Impact and effectiveness: Impact and effectiveness both consider what results are achieved by the intervention, effects on beneficiaries and unintended results. The difference between the two criteria will largely depend on how the intervention and its objectives were designed. Effectiveness will generally focus on the achievement of stated objectives (at any level of the results chain); impact will always focus on higher-level effects that would not otherwise automatically be considered (because they were not included as objectives). Another way to think about the distinction between the two is that over time many interventions may be rated as effective, but still not “add up” to the desired higher-level or transformational change. If impact is not evaluated, these trends will be missed. Articulations of the intended results of an intervention will vary with different institutions often having different requirements for defining the results chain. In some institutions, every intervention is required to link to higher-level goals, while in others only immediate effects are considered. In applying the two criteria, it will be useful for institutions to ensure requirements for intervention design are clear and coherent. Where smaller interventions, such as projects, do not routinely articulate how they link to higher-level goals, it is important for the evaluation policy to mandate analysis of impact – otherwise higher-level impacts will be missed (see Box 4.12 for an example).

  • Impact and coherence: The fact that impact involves taking a holistic perspective means that it naturally fits well with considerations of coherence, as the effects achieved by an intervention almost always depend on other interventions, policy goals, trade-offs and the systems in which the intervention takes place. One example would be development co-operation programmes that support strengthening public health systems in developing countries, where the impact is affected not only by the programme, but also by domestic or global policies on pricing and regulation for pharmaceuticals or recruitment of health workers.

  • Impact and efficiency: As noted under the efficiency criterion, in order to look at efficiency in the broadest sense, evaluators need to consider a holistic picture of the results achieved (e.g. impact, sustainability) and compare results with the resources.

  • Impact and sustainability: Impact and sustainability both consider, to some extent, whether results will endure over time. Impact focuses on the time dimension in terms of examining transformational changes (which are enduring by nature). Sustainability looks at the continuation of benefits. As a criterion, sustainability is broader because it considers the conditions for achieving sustainability and the links between an intervention’s economic, social and environmental sustainability.

Transformational change, differential impact and significance are all intrinsically linked with inclusion. It is important here to understand what impact has occurred and for whom. Have there been meaningful contributions to transforming systems of oppression and could this lead to lasting change for marginalised and vulnerable groups? Evaluators should aim to understand the extent to which these unintended impacts stem from structural inequality within wider systems and the impact of interventions on these systems.

The revised definition of impact emphasises the high-level results of an intervention, including the long-term social and economic impacts. It encompasses transformative change to systems and norms including, as noted in the definition, “potential effects on people’s wellbeing, human rights, gender equality and the environment”. Impact is where evaluators can see the bigger picture of how an intervention contributes and adds to transformational change, equity, human rights and empowerment.

Of all the six criteria, impact is the one that can often be the most challenging to evaluate and understand. Four of the main challenges involved, together with some suggestions on how to proceed, are summarised below.

As a guiding rule, evaluating impact typically requires more resources and considerably more primary data than for the other criteria, and should only be built into an evaluation design when these resources are likely to be available or can be obtained for this purpose. On the other hand, because it focuses on if and how the intervention has made a difference, this is the area that will often receive the most attention from users. Accordingly, the investment in time and effort to include it in the evaluation is often justifiable.

An additional challenge when evaluating impact is related to the deadlines set by the institutions that commission evaluations. These deadlines are often the same as the closing date of the intervention. This requirement could be more flexible to allow impact to be examined over a longer period of time, enabling better understanding of those impacts which may only become evident after an intervention has finished.

The table below identifies several of the key challenges when evaluating impact and suggests ways of addressing them for both evaluators and evaluation managers.

This section includes a cross section of examples demonstrating how impact has been evaluated for interventions related to family empowerment, peacebuilding, research support, land use planning, and violence against women and girls.

Assessing sustainability allows evaluators to determine if an intervention’s benefits will last financially, economically, socially and environmentally. While the underlying concept of continuing benefits remains, the criterion is both more concise and broader in scope than the earlier definition of this criterion.6 Sustainability encompasses several elements for analysis – financial, economic, social and environmental – and attention should be paid to the interaction between them.

Confusion can arise between sustainability in the sense of the continuation of results, and environmental sustainability or the use of resources for future generations. While environmental sustainability is a concern (and may be examined under several criteria, including relevance, coherence, impact and sustainability), the primary meaning of the criterion is not about environmental sustainability as such; when describing sustainability, evaluators should be clear on how they are interpreting the criterion.

Sustainability should be considered at each point of the results chain and the project cycle of an intervention. Evaluators should also reflect on sustainability in relation to resilience and adaptation in dynamic and complex environments. This includes the sustainability of inputs (financial or otherwise) after the end of the intervention and the sustainability of impacts in the broader context of the intervention. For example, an evaluation could assess whether an intervention considered partner capacities and built ownership at the beginning of the implementation period as well as whether there was willingness and capacity to sustain financing at the end of the intervention. In general, evaluators can examine the conditions for sustainability that were or were not created in the design of the intervention and by the intervention activities and whether there was adaptation where required.

Evaluators should not look at sustainability only from the perspective of donors and external funding flows. Commissioners should also consider evaluating sustainability before an intervention starts, or while funding or activities are ongoing. When assessing sustainability, evaluators should: 1) take account of net benefits, which means the overall value of the intervention’s continued benefits, including any ongoing costs, and 2) analyse any potential trade-offs and the resilience of capacities/systems underlying the continuation of benefits. There may be a trade-off, for example, between the fiscal sustainability of the benefits and political sustainability (maintaining political support).
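The notion of net benefits described above can be sketched in a few lines of Python: continued benefits are set against the ongoing costs of maintaining them, year by year. The function name and all figures are hypothetical and are used only to show the arithmetic.

```python
def net_benefits(benefits, ongoing_costs):
    """Yearly net benefits: continued benefits minus the ongoing costs of sustaining them."""
    if len(benefits) != len(ongoing_costs):
        raise ValueError("benefits and ongoing_costs must cover the same years")
    return [b - c for b, c in zip(benefits, ongoing_costs)]


# Hypothetical yearly values after the intervention has ended.
continued_benefits = [50.0, 45.0, 40.0]
ongoing_costs = [20.0, 25.0, 45.0]

yearly_net = net_benefits(continued_benefits, ongoing_costs)
total_net = sum(yearly_net)

for year, value in enumerate(yearly_net, start=1):
    print(f"year {year}: net benefit {value:.1f}")
print(f"total net benefit: {total_net:.1f}")
```

In this illustration benefits continue in every year, yet the net benefit turns negative in the third year because maintenance costs rise, which is the kind of pattern that an assessment based on gross benefits alone would miss.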

Evaluating sustainability provides valuable insight into the continuation or likely continuation of an intervention’s net benefits in the medium to longer term, something that various meta-analyses have shown to be very challenging to achieve in practice. For example, in sectors such as water and sanitation, or intervention types such as community driven development, benefits often fade out after some time. The role of evaluation here can be to scrutinise assumptions in the theory of change for how sustainability is achieved (Mansuri and Rao, 2013[23]; White, Menon and Waddington, 2018[24]).

If these various aspects of sustainability are carefully considered by an evaluation, it can lead to important insights into how interventions can plan and implement for change that ensures sustainable development in the future. The lessons may highlight potential scalability of the sustainability measures of the intervention within the current context or the potential replicability in other contexts.

A key aspect of sustainability is exit planning. Evaluations should assess whether an appropriate exit strategy has been developed and applied, which would ensure the continuation of positive effects including, but not limited to, financial and capacity considerations. If the evaluation is taking place ex post, the evaluator can also examine whether the planned exit strategy was properly implemented to ensure the continuation of positive effects as intended, whilst allowing for changes in contextual conditions as in the examples below.

A useful resource for further understanding sustainability and addressing common challenges is a meta-evaluation (Noltze, Euler and Verspohl, 2018[25]) and evaluation synthesis (Noltze, Euler and Verspohl, 2018[26]) recently conducted by the German Institute for Development Evaluation (DEval). The two studies highlight the various elements for analysis of sustainability that can be examined in an evaluation. The meta-evaluation makes a strong case for the evaluation of sustainability that incorporates the principles of the SDGs, highlighting the areas in which this can add value to an evaluation. This includes an analysis of how identifying and assessing the unintended effects of a project and the interactions or trade-offs between the different dimensions of sustainability can support learning and accountability when applying the sustainability criterion.

Understanding the definition of sustainability involves understanding three components: the enabling environment, the continuation of positive effects, and risks and trade-offs.

Evaluations can consider how an intervention contributed to improving the enabling environment for development in multiple ways, including how the intervention ensured the strengthening of systems, institutions or capacities to support future development or humanitarian activity. This encourages evaluations to consider the development partner capacity that has been built or strengthened as a result of the intervention, as well as the resilience built to absorb external changes and shocks. This will ensure that the net benefits, as discussed earlier, continue into the future.

To provide examples of how the enabling environment for development can be improved, contributions of an intervention could include: capacities strengthened (at the individual, community, or institutional level); improved ownership or political will; increased national (and where applicable subnational) financial or budgetary commitments; policy or strategy change; legislative reform; institutional reforms; governance reforms; increased accountability for public expenditures; or improved processes for public consultation in development planning.

Sustainability can be evaluated over different timeframes. Evaluators can assess both actual sustainability (i.e. the continuation of net benefits created by the intervention that are already evident) and prospective sustainability (i.e. the net benefits for key stakeholders that are likely to continue into the future). Evaluators should carefully consider appropriate evaluation approaches to assess actual and/or prospective sustainability, depending on the timing of the evaluation and the timescale of intended benefits. Many higher-level changes will take years or decades to be fully realised.

In terms of evaluating actual sustainability, the evaluator can examine the extent to which any positive effects generated by the intervention demonstrably continued for key stakeholders, including intended beneficiaries, after the intervention has ended. Evaluators can also examine if and how opportunities to support the continuation of positive effects from the intervention have been identified, anticipated and planned for, as well as any barriers that may have hindered the continuation of positive effects. This can support findings that demonstrate adaptive capacity in an intervention where it was required.

Examining prospective sustainability entails a slightly different approach. An evaluation examining the future potential for sustainability would assess how likely it is that any planned or current positive effects of the intervention will continue, usually assuming that current conditions hold. The evaluation will need to assess the stability and relative permanence of any positive effects realised, and conditions for their continuation, such as institutional sustainability, economic and financial sustainability, environmental sustainability, political sustainability, social sustainability and cultural sustainability.

Assessing sustainability involves not only looking at the likelihood of continued positive effects but also examining the potential risks and ongoing costs associated with an intervention. Therefore, evaluation managers should consider the factors that may enhance the sustainability of net benefits over time as well as factors that may inhibit sustainability. Examining the risks related to the sustainability of an intervention can involve assessing the extent to which there are identifiable or foreseeable positive or negative contextual factors that may influence the durability of the intervention’s results.

This also raises the issue of trade-offs, an important element of the revised criteria. Assessing the trade-offs associated with an intervention encourages examination of the trade-off between instant impact and potential longer-term effects or costs as well as the trade-offs between financial, economic, social and environmental aspects. For instance, an evaluation may find that an intervention supported economic growth but that this growth is unsustainable due to its major environmental costs that may negatively impact longer-term economic growth. This is in line with the SDG definition of sustainable development and broadens the scope for evaluations to examine sustainability beyond just the likelihood of continued positive effects from an intervention.

Sustainability is closely linked with the other criteria.

  • Sustainability is linked to relevance, with the level of relevance to key stakeholders being a key factor affecting their ownership and buy-in to eventual benefits, which in turn drive sustainability.

  • Likewise, coherence can provide useful insights on sustainability, as it looks at other interventions in a given context, which could support, or undermine, the intervention’s benefits over time.

  • Effectiveness and impact: The evaluation of the continuation of results relies firstly on results having been achieved (effectiveness) and secondly, that higher-level effects were demonstrated (impact). Therefore, effectiveness and impact can be seen as overriding criteria for sustainability because if their analysis does not show the intervention achieving outputs, outcomes or impacts, there will be no clear benefits to sustain. Box 4.19 provides an example of how impact and sustainability can be examined together. Considering synergies between impact, effectiveness and sustainability by evaluating conditions that are sufficient and necessary for results to continue after the intervention has finished enables evaluators to explore effectiveness and impact over the longer term.

  • Efficiency concerns may also undermine sustainability of benefits: for example, when short-term costs drive decision making to the detriment of longer-term effects, sustainability may be lessened.

The revised definition of sustainability and its note draw attention to the “financial, economic, environmental and social” dimensions of sustainability and how these support the ongoing and long-term benefits of the intervention’s results. Evaluators should consider how the continuation of benefits for different groups of beneficiaries has been planned for and, if the evaluation is taking place ex post, how this is manifest for these different groups. Here, there should be a focus on the “leave no one behind” principle and how marginalised groups experience ongoing positive benefits as well as trade-offs that may occur between different groups.

It is also relevant for evaluators to consider the extent to which the intervention has built an enabling environment for inclusive and equitable development, addressing underlying systemic issues (“treating the illness, not just symptoms”) under both the impact and sustainability criteria. Questions of ownership and gender empowerment are important here. Sustainability of systems requires increased capacity so evaluators should understand whose capacity has been built and how this relates to existing unequal systems and structures. Is there both capacity and commitment from different stakeholder groups to create and uphold an enabling environment for gender equality and women’s empowerment over the medium to long term? If not, what are the barriers, and are these within the scope of the intervention?

The table below identifies challenges when evaluating sustainability related to timing, the lack of positive effects and other factors affecting sustainability, and suggests ways of addressing them for both evaluators and evaluation managers.

This section gives examples from rural development in Afghanistan, general budget support and maternal health, to demonstrate how sustainability has been evaluated.


[11] Arghiros, D. et al. (2017), Making it Count: Lessons from Australian Electoral Assistance 2006-16, Australian Government Department of Foreign Affairs and Trade, http://www.oecd.org/derec/australia/australia-electoral-assistance-2006-2016.pdf (accessed on 11 January 2021).

[15] Baltzer, K. and H. Hansen (2011), Agricultural Input Subsidies in Sub-Saharan Africa: Evaluation Study, DANIDA, International Development Cooperation, Ministry of Foreign Affairs of Denmark, http://www.oecd.org/derec/49231998.pdf (accessed on 11 January 2021).

[31] Befani, B. and J. Mayne (2014), “Process Tracing and Contribution Analysis: A Combined Approach to Generative Causal Inference for Impact Evaluation”, IDS Bulletin, Vol. 45/6, pp. 17-36, https://doi.org/10.1111/1759-5436.12110.

[34] Belcher, B. and M. Palenberg (2018), “Outcomes and Impacts of Development Interventions”, American Journal of Evaluation, Vol. 39/4, pp. 478-495, https://doi.org/10.1177/1098214018765698.

[19] Bryld, E. (2019), Evaluation of Sida’s Support to Peacebuilding in Conflict and Post-Conflict Contexts: Somalia Country Report, Sida, https://publikationer.sida.se/contentassets/1396a7eb4f934e6b88e491e665cf57c1/eva2019_5_62214en.pdf (accessed on 11 January 2021).

[35] Chambers, R. et al. (2009), “Designing impact evaluations: different perspectives”, No. 4, 3ie, https://www.3ieimpact.org/evidence-hub/publications/working-papers/designing-impact-evaluations-different-perspectives (accessed on 11 January 2021).

[18] Economía Urbana and IPSOS (2019), Evaluación de operaciones del programa “más familias en acción” y de resultados del componente de bienestar comunitario, [Evaluation of the "More Families in Action" programme], Departamento Nacional de Planeación, Bogotá D.C., https://colaboracion.dnp.gov.co/CDT/Sinergia/Documentos/Evaluacion_MFA_Informe_Resultados.pdf (accessed on 15 January 2021).

[5] Eurecna Spa (2020), Bolivia - Evaluation of Health Initiatives (2009-2020), Italian Ministry of Foreign Affairs and International Cooperation, http://www.oecd.org/derec/italy/evaluation-report-of-health-initiatives-in-Bolivia-2009_2020.pdf (accessed on 11 January 2021).

[4] FAO (2020), Evaluation of “Improving farmer livelihoods in the dry zone through improved livestock health, productivity and marketing”, Food and Agriculture Organization of the United Nations, Rome, http://www.fao.org/3/ca8463en/ca8463en.pdf (accessed on 11 January 2021).

[30] Gertler, P. et al. (2016), Impact Evaluation in Practice: Second Edition, World Bank Group, Washington D.C., https://publications.iadb.org/en/impact-evaluation-practice-second-edition (accessed on 12 January 2021).

[8] Global Affairs Canada (2019), Evaluation of Natural Disaster Reconstruction Assistance in the Philippines, 2013-14 to 2018-19, Global Affairs Canada, https://www.international.gc.ca/gac-amc/publications/evaluation/2019/endra-earcn-philippines.aspx?lang=eng (accessed on 12 January 2021).

[22] Hallman, K. et al. (2018), Girl Empower Impact Evaluation: Mentoring and Cash Transfer Intervention to Promote Adolescent Wellbeing in Liberia, International Rescue Committee, https://www.rescue.org/sites/default/files/document/4346/girlempowerimpactevaluation-finalreport.pdf (accessed on 12 January 2021).

[29] ICAI (2018), Assessing DFID’s Results in Improving Maternal Health: An Impact Review, The Independent Commission for Aid Impact, https://icai.independent.gov.uk/wp-content/uploads/ICAI-review-Assessing-DFIDs-results-in-improving-Maternal-Health-.pdf (accessed on 11 January 2021).

[2] IDD and Associates (2006), A Joint Evaluation of General Budget Support Evaluation of General Budget Support: Synthesis Report, https://www.oecd.org/development/evaluation/dcdndep/37426676.pdf (accessed on 11 January 2021).

[12] IDEV (2018), Cabo Verde: Evaluation of the Bank’s Country Strategy and Program 2008–2017: Summary Report, Independent Development Evaluation, African Development Bank, http://www.oecd.org/derec/afdb/AfDB-2008-2017-cabo-verde-bank-strategy.pdf (accessed on 11 January 2021).

[14] IEG (2008), The Welfare Impact of Rural Electrification: A Reassessment of the Costs and Benefits, World Bank, https://doi.org/10.1596/978-0-8213-7367-5.

[16] IOB (2017), Tackling major water challenges: Policy review of Dutch development aid policy for improved water management, Policy and Operations Evaluation Department, Ministry of Foreign Affairs, Netherlands, https://english.iob-evaluatie.nl/publications/policy-review/2017/12/01/418-%E2%80%93-iob-%E2%80%93-policy-review-of-dutch-development-aid-policy-for-improved-water-management-2006-2016-%E2%80%93-tackling-major-water-challenges (accessed on 12 January 2021).

[33] Leeuw, F. and J. Vaessen (2009), Impact Evaluations and Development: NoNIE Guidance on Impact Evaluation, The Networks on Impact Evaluation, Washington D.C., http://search.oecd.org/dac/evaluation/dcdndep/47466906.pdf (accessed on 12 January 2021).

[21] Leppert, G. et al. (2018), Impact, Diffusion and Scaling-Up of a Comprehensive Land-Use Planning Approach in the Philippines: From Development Cooperation to National Policies, German Institute for Development Evaluation (DEval), Bonn, https://www.deval.org/files/content/Dateien/Evaluierung/Berichte/2018/Zusammenfassung_Deutsch_%20DEval-2018_Philippinen_final_web-2.pdf (accessed on 12 January 2021).

[23] Mansuri, G. and V. Rao (2013), Localizing Development, The World Bank, https://doi.org/10.1596/978-0-8213-8256-1.

[6] Ministry of Foreign Affairs Japan (2019), Japan ODA Evaluation Guidelines, https://www.mofa.go.jp/policy/oda/evaluation/basic_documents/pdfs/guidelines11th.pdf (accessed on 18 February 2021).

[17] Molund, S. and G. Schill (2004), Looking Back, Moving Forward: Sida Evaluation Manual, Sida, https://www.oecd.org/derec/sweden/35141712.pdf (accessed on 11 January 2021).

[26] Noltze, M., M. Euler and I. Verspohl (2018), Evaluation Synthesis of Sustainability in German Development Cooperation, German Institute for Development Evaluation (DEval), Bonn, http://www.deval.org/files/content/Dateien/Evaluierung/Berichte/2018/DEval_Evaluierungssynthese_EN_web.pdf (accessed on 12 January 2021).

[25] Noltze, M., M. Euler and I. Verspohl (2018), Meta-Evaluation of Sustainability in German Development Cooperation, German Institute for Development Evaluation (DEval), Bonn, http://www.deval.org/files/content/Dateien/Evaluierung/Berichte/2018/DEval_NH_Meta-Evaluierung_EN_web.pdf (accessed on 12 January 2021).

[7] Norad (2018), Evaluation of Norwegian Efforts to Ensure Policy Coherence for Development, Norad, https://www.norad.no/contentassets/4ac3de36fbdd4229811a423f4b00acf7/8.18-evaluation-of-norwegian-efforts-to-ensure-policy-coherence-for-development.pdf (accessed on 11 January 2021).

[3] OECD (2011), Evaluating Budget Support: Methodological Approach, DAC Network on Development Evaluation, OECD Publishing, Paris, https://www.oecd.org/dac/evaluation/dcdndep/Methodological%20approach%20BS%20evaluations%20Sept%202012%20_with%20cover%20Thi.pdf (accessed on 12 January 2021).

[1] OECD (2010), Quality Standards for Development Evaluation, DAC Guidelines and Reference Series, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264083905-en.

[10] OECD (2002), Evaluation and Aid Effectiveness No. 6 - Glossary of Key Terms in Evaluation and Results Based Management (in English, French and Spanish), OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264034921-en-fr.

[28] Orth, M., M. Birsan and G. Gotz (2018), The Future of Integrated Policy-Based Development Cooperation: Lessons from the Exit from General Budget Support in Malawi, Rwanda, Uganda and Zambia, German Institute for Development Evaluation (DEval), Bonn, http://www.deval.org/files/content/Dateien/Evaluierung/Berichte/2018/DEval_EN_The%20Future%20of%20Integrated%20Policy-Based%20Development%20Cooperation..pdf (accessed on 12 January 2021).

[13] Palenberg, M. (2011), BMZ: Tools and Methods for Evaluating the Efficiency of Development Interventions | Managing for Sustainable Development Impact, BMZ Evaluation Division – German Federal Ministry for Economic Cooperation and Development, http://www.managingforimpact.org/resource/bmz-tools-and-methods-evaluating-efficiency-development-interventions (accessed on 12 January 2021).

[20] PEM Consult (2020), Evaluation of the Danish Strategic Sector Cooperation, Evaluation, Learning and Quality Department, Ministry of Foreign Affairs/Danida, Denmark, https://um.dk/en/danida-en/results/eval/eval_reports/publicationdisplaypage/?publicationID=CBE77158-1D4D-46E3-A81F-9B918218FAFF (accessed on 12 January 2021).

[9] Slovenia’s Ministry of Foreign Affairs (2017), Evaluation of Slovenia’s Development Cooperation with Montenegro 2013-2016: Final Report, Slovenia’s Development Cooperation, https://www.gov.si/assets/ministrstva/MZZ/Dokumenti/multilaterala/razvojno-sodelovanje/Development-cooperation-with-Montenegro-evaluation-final-report.pdf (accessed on 12 January 2021).

[32] UNEG (2013), Impact Evaluation in UN Agency Evaluation Systems: Guidance on Selection, Planning and Management, http://www.uneval.org/document/detail/1433 (accessed on 12 January 2021).

[27] Watanabe, K. (2016), Ex-Post Evaluation of Technical Cooperation Project “Inter-Communal Rural Development Project”, JICA, https://www2.jica.go.jp/en/evaluation/pdf/2015_0603847_4.pdf (accessed on 12 January 2021).

[24] White, H., R. Menon and H. Waddington (2018), “Community-driven development: does it build social cohesion or infrastructure? A mixed-method evidence synthesis”, No. 30, 3ie, https://www.3ieimpact.org/evidence-hub/publications/working-papers/community-driven-development-does-it-build-social-cohesion (accessed on 12 January 2021).


1. There may be instances where two institutions participate in an intervention, one as the implementing partner and another one as the funder. It is possible in this configuration to assess internal coherence from the perspectives of both the funder and the implementing partner.

2. An important caveat to this is that evaluating impact is much more likely to be useful when it is accompanied by an analysis of how impact is achieved and what can be done to increase impact than when it is purely an accountability exercise.

3. A detailed and comprehensive discussion on methodological options for defining and assessing efficiency is set out in a BMZ working paper by Palenberg (2011[13]), which identifies three different levels. Level 0: Describing and providing an opinion on some efficiency-related aspects of an aid intervention. Level 1: Identifying efficiency improvement potential within an aid intervention. This provides a partial picture of the implementation processes, costs of inputs, conversion of inputs into outputs or conversion of outputs into outcomes. Level 2: Assessing the efficiency of an aid intervention in a way that it can be compared with alternatives or benchmarks. This is a comprehensive approach that includes a reliable estimate of all major benefits and costs.

In practice, Level 2 assessment is rarely applied. Even an organisation with huge capacity such as the World Bank had noted by 2010 that there had been a large decline in application of cost-benefit analysis at the appraisal stage, even in the sectors where it was most applicable. The report highlighted the positive aspects in terms of rigour and links to subsequent project performance of using such a depth of analysis, but also the challenges involved (IEG, 2010). The example of rural electrification (IEG, 2008[14]) is a relatively rare example of a full Level 2 analysis. Other options available to evaluators include multi-criteria decision making/modelling. It is also important to note that such types of efficiency analysis are more likely to be applicable in certain sectors, for example, infrastructure, health and agriculture. An interesting example from agriculture on input subsidies, which uses a country case study approach drawing on a range of economic estimates and survey data, is cited in Box 4.10.

4. The literature on this topic is abundant, but see for example: UNEG (2013[32]); Chambers et al. (2009[35]); Leeuw and Vaessen (2009[33]); Belcher and Palenberg (2018[34]); and Gertler et al. (2016[30]).

5. Not to be confused with statistical significance, which often comes up under certain types of impact evaluations; see Gertler et al. (2016[30]), which discusses power calculations and related technical concepts in quantitative impact evaluations.

6. The 2002 Glossary (OECD, 2002[10]) definition of sustainability is: “The continuation of benefits from a development intervention after major development assistance has been completed. The probability of continued long-term benefits. The resilience to risk of the net benefit flows over time.”


This work is published under the responsibility of the Secretary-General of the OECD. The opinions expressed and arguments employed herein do not necessarily reflect the official views of OECD member countries.

This document, as well as any data and map included herein, are without prejudice to the status of or sovereignty over any territory, to the delimitation of international frontiers and boundaries and to the name of any territory, city or area.

The statistical data for Israel are supplied by and under the responsibility of the relevant Israeli authorities. The use of such data by the OECD is without prejudice to the status of the Golan Heights, East Jerusalem and Israeli settlements in the West Bank under the terms of international law.

Note by Turkey
The information in this document with reference to “Cyprus” relates to the southern part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is found within the context of the United Nations, Turkey shall preserve its position concerning the “Cyprus issue”.

Note by all the European Union Member States of the OECD and the European Union
The Republic of Cyprus is recognised by all members of the United Nations with the exception of Turkey. The information in this document relates to the area under the effective control of the Government of the Republic of Cyprus.

Photo credits: Cover © TFK.

Corrigenda to publications may be found on line at: www.oecd.org/about/publishing/corrigenda.htm.

© OECD 2021

The use of this work, whether digital or print, is governed by the Terms and Conditions to be found at http://www.oecd.org/termsandconditions.