Chapter 12: Evaluation approaches

Tess Tsindos

Learning outcomes

Upon completion of this chapter, you should be able to:

  • Identify the key terms, concepts and approaches used in evaluation.
  • Explain the methods of data collection and analysis for evaluations.
  • Discuss the advantages and disadvantages of different evaluative approaches.

What is evaluation?

There are many definitions of ‘evaluation’. Mertens and Wilson1 quote from Fournier:

Evaluation is an applied inquiry process for collecting and synthesizing evidence that culminates in conclusions about the state of affairs, value, merit, worth, significance, or quality of a program, product, person, policy, proposal, or plan. Conclusions made in evaluations encompass both an empirical aspect (that something is the case) and a normative aspect (judgement about the value of something). (p6)

Hawe and colleagues provide the following definition of (program) evaluation2:

… program evaluation usually involves observing and collecting measures about how a program operates and the effects it appears to be having and comparing this to a preset standard or yardstick. (pp6–7)

A clearer definition, however, comes from the Better Evaluation website (para 1):

any systematic process to judge merit, worth or significance by combining evidence and values.

Many words are used interchangeably to refer to evaluation, such as ‘appraise’, ‘review’, ‘assess’ and ‘interpret’. This is because the definition differs between disciplines and sectors, depending on the aims and outcomes of what is being evaluated. While evaluation is different to research, it is often considered similar because the underlying principles of collecting and assessing research evidence are the same.3 The differences and similarities between research and evaluation are outlined extensively on the Better Evaluation website.

Approaches to evaluation

Given that evaluation is conducted across many sectors, spanning education, agriculture, community services, international development, policing and justice, health, and others, it is unsurprising that a diverse array of evaluation approaches has been developed. Stufflebeam describes 22 program evaluation approaches in his widely cited monograph.4 These approaches are distinguished by factors such as their purpose, scope, engagement with stakeholders, methods, timing and applications. In this chapter we focus on four approaches that are most often used in health and social care (Table 12.1).

Objectives-based evaluation

Objectives-based evaluation methods are widely used in the education sector and public health. The focus of this approach is the assessment of whether the program’s objectives, which must align with the identified needs of program participants, are achieved. This approach places strong emphasis on valid measurement of program effects. It is also referred to as ‘impact evaluation’.

Program and service managers, funding bodies and researchers usually determine the questions that guide these evaluations and the appropriate measures to answer them. While experimental or quasi-experimental methods are not essential to this approach, they are reasonably commonly used in public health to address questions concerning the causal relationships between interventions (independent variables) and the achievement of objectives (dependent variables). The objectives-based approach has been criticised for placing too great an emphasis on a tightly prescribed set of program endpoints, with insufficient attention to the process of implementing a program. However, the collection of process information is compatible with the approach, particularly for determining the association between intervention exposure and outcomes.

Empowerment and participatory evaluation

Empowerment and participatory evaluation is distinguished by the prominent role it gives to program participants in the evaluation process. Participants must determine the evaluation questions, consistent with the interests they have in the program, and make decisions about the appropriate methods to use. Participants may be heavily involved in data collection, analysis and dissemination. The evaluator must respect participants’ choices in all aspects of the inquiry and facilitate and build their capacity to control the evaluation in their preferred ways. Including program participants in an evaluation can be time consuming and require additional resources.

The empowerment approach values the rich and diverse insights that participants bring from their lived experiences, and places greater emphasis on relevance than rigour. One perspective is not regarded as more ‘true’ than others, and the purpose of the evaluation is not to reach ultimately correct conclusions but rather to empower participants (especially people who are disenfranchised) and catalyse social change through raised consciousness.5 Qualitative methods, and participatory action research (see Chapter 7), are commonly used in empowerment evaluation.

Realist evaluation

The realist evaluation approach has been advanced by Pawson and Tilley. It shares many elements of the theory of change perspective and is founded on the critical realist paradigm.6 Researchers adopting this approach maintain that interventions are theories concerning how a set of activities will operate in given social contexts to bring about change and the achievement of desired objectives. A key role for evaluators, therefore, is to work with program managers and other stakeholders to ensure the theory of change is inherent and explicit within a program; this is usually represented as a program logic model, which is used to guide the evaluation.

The realist evaluation approach is characterised by the purposive sampling of a wide variety of quantitative and qualitative information to shed light on the generative mechanisms of change that take place in the program, and the contextual factors that determine whether they are activated or not. The focus on learning about how change is achieved (rather than the assessment of program effects), together with its recognition of the critical role of context (defined in the broadest sense) and attention to both intended and unintended consequences, has made this an appealing approach for the evaluation of complex interventions.6

Utilisation-focused evaluation

Patton argues that evaluations should be undertaken for specific intended uses.7 A critical role for the evaluator, therefore, is to undertake an analysis of program stakeholders, to identify the primary users of the program and to determine the needs and associated questions they have concerning the program. Utilisation-focused evaluation has a neutral value base in that the approach acknowledges that evaluation may be undertaken to assess processes, impacts and cost–benefit, to bring about improvement, to generate knowledge, or for other purposes determined by stakeholders.

The utilisation-focused approach may adopt quantitative, qualitative or mixed methods, and these decisions are also guided by the interests of stakeholders (all those involved in a program, including those who develop, deliver and benefit from the program) and their views about the kinds of data that are credible and useful. The evaluator can facilitate this decision-making by presenting a menu of evaluation methods to stakeholders and providing expert advice to enable assessment of the utility, validity and cost-effectiveness of different options. Because this approach offers stakeholders a wide-ranging set of design options, it can be more expensive to conduct than applying a single evaluation design. As Patton states, ‘… by actively involving primary intended users, the evaluator is training users in use, preparing the groundwork for use, and reinforcing the intended utility of the evaluation every step along the way’.7(p38)

No single evaluation approach is better than the others, and multiple approaches can be drawn upon within one evaluation project, depending on the purpose of the evaluation.8

Table 12.1. Summary of the four evaluation approaches

Objectives-based (also known as ‘impact evaluation’)
Key idea: Set and meet objectives.
Seeks to: Set clear objectives and achieve them. Measure program effects. Address questions concerning the causal relationships between interventions and the achievement of objectives.
Data collection: Data is collected after the program has been fully implemented. Data collection methods and tools should be guided by the specific types of data needed to answer the evaluation question(s).
Sample considerations: Not a consideration; whether the program achieved its goals is what is being evaluated.
Advantages: Identifies where programs can be improved or modified to meet objectives, or discontinued. Attempts to make unbiased, balanced observations about the program’s outcomes, based on verifiable data. Can provide clear recommendations based on what has worked.
Disadvantages: Evaluators need to remember that their own assumptions will influence the interpretation of findings and recommendations.

Empowerment and participatory
Key idea: Share power.
Seeks to: Actively involve program participants, practitioners and community in the design, delivery and analysis of the program and its evaluation. Ensure the community identifies issues, decides how to address them, monitors progress and uses information to sustain the program.
Data collection: Data is collected at any point, with the community defining design, delivery and outcomes. Data collection methods and tools should be guided by the types of data needed to answer the evaluation question(s).
Sample considerations: Sample size is not usually relevant but depends on program size, participant involvement and evaluation questions.
Advantages: Shared control of the evaluation to define issues, judge effectiveness, set directions and influence the flow of resources. The insider role is valuable for interpreting findings. Can create continuity between evaluation, planning and action.
Disadvantages: Community members may have low levels of objectivity and/or limited ability to contribute to planning and development. Time-consuming and requires additional resources. May raise false expectations about future support for the program.

Realist
Key idea: Context.
Seeks to: Understand and explain ‘what works, for whom, under what circumstances, and how’.6(p15) Attempts to explain how outcomes are caused by identifying the underlying reasoning of people – the context makes a difference.
Data collection: Data is collected once the program has had time to operate and outcomes can be evaluated. Data collection methods and tools should be guided by the types of data needed to answer the evaluation question(s).
Sample considerations: Sample size depends on program size and evaluation questions.
Advantages: Well suited to assessing how interventions work in complex situations. Can lead to a shared understanding of the intervention (program) among the people involved. Most appropriate for evaluating new programs that work but for which it is not yet understood why they work.
Disadvantages: It is a theory-driven approach that can be difficult to understand. The evaluation scope needs to be clearly set. Can be more expensive than a simple evaluation design.

Utilisation-focused
Key idea: Useful.
Seeks to: Provide evaluation findings that are worth using and can inform decision-making.
Data collection: Data is collected once the program has had time to operate and outcomes can be evaluated. Data collection methods and tools should be guided by the types of data needed to answer the evaluation question(s).
Sample considerations: Sample size depends on program size and evaluation questions.
Advantages: Advocates for a participatory approach where key stakeholders assume ownership of the evaluation. Can provide solid evidence of program effectiveness. Focuses on real and specific users and uses.
Disadvantages: Time-consuming to involve the full range of stakeholders. Not necessarily a linear evaluation process.

Data collection methods and analysis

The methods of data collection and analysis depend on an appropriate evaluation design. Evaluation designs describe a set of tasks for systematically examining the effects of a program. A good study design creates confidence that the program caused the observed changes: with careful planning, the best possible measures are used to assess the program’s impacts, alternative explanations for the results can be ruled out, and it is possible to identify how the program worked for the target population.9

It is not possible to provide an in-depth discussion of evaluation designs here, but they follow the same principles as the designs used in academic research: observational, quasi-experimental and experimental. Observational designs do not use control or comparison groups, and measurements are usually taken at a single point in time; they include pre-tests and post-tests, time-series designs and case studies. Quasi-experimental designs are generally more suited to public health than other types of evaluations and aim to use a comparison group; they do not, however, randomly allocate participants to groups. Quasi-experimental designs include reciprocal group designs, historical control group designs and stepped intervention designs. Experimental designs expose a group of people to an intervention and compare this group to a control group that was not exposed to the intervention. The main types of experimental designs are randomised controlled trials (RCTs) and cluster randomised controlled trials (CRCTs).10
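To make the distinction concrete, the sketch below (in Python, with entirely invented outcome scores and hypothetical group labels) contrasts a simple observational pre-test/post-test comparison with a quasi-experimental comparison-group analysis expressed as a difference-in-differences estimate. It is a minimal illustration of the design logic, not a prescribed analysis method.

```python
# Illustrative sketch only: hypothetical outcome scores for a health program,
# showing how adding a comparison group changes the estimate of program effect.
from statistics import mean

# Hypothetical pre/post outcome scores for the intervention group
intervention_pre = [52, 48, 55, 60, 50, 47, 58, 53]
intervention_post = [61, 57, 66, 70, 58, 55, 69, 62]

# Hypothetical scores for a non-randomised comparison group (quasi-experimental design)
comparison_pre = [51, 49, 54, 59, 52, 46, 57, 52]
comparison_post = [55, 52, 58, 63, 55, 49, 61, 56]

# Observational pre/post design: change in the intervention group only
pre_post_change = mean(intervention_post) - mean(intervention_pre)

# Quasi-experimental design: difference-in-differences estimate, which subtracts
# the change that occurred anyway in the comparison group
did_estimate = pre_post_change - (mean(comparison_post) - mean(comparison_pre))

print(f"Pre/post change (no comparison group): {pre_post_change:.1f}")
print(f"Difference-in-differences estimate:    {did_estimate:.1f}")
```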

Data collection methods (discussed in Section 3) also depend on the level of evaluation being conducted. The four levels of evaluation are:

  • Formative: testing and refinement prior to implementation
  • Process: implementation of strategies
  • Impact: attainment of objectives
  • Outcome: attainment of goals.

Examples of these evaluation levels are outlined in Table 12.2.

Data collection measures rely on the level of evaluation being clearly articulated and planned, with clear objectives and data collection points identified. The appropriate method of data collection will be the one that can gather the information necessary to answer the evaluation questions. For example, quantitative measures and tools are best suited to collecting information about attendance at training programs (surveys and attendance sheets), but qualitative measures and tools (interviews and focus groups) are best suited to collecting information about participants’ experiences of the training. The evaluator must become familiar with the program and its goals, so as to choose the most appropriate data-collection methods and tools.

Data analysis follows the same principles as academic data analysis, which are explained in Section 4. For example, numerical data such as costs, attendance numbers and biometric measures can be analysed for patterns and correlations and summarised in cross-tabulations, frequency tables and more. Textual data from spoken or written words can be analysed using content analysis, thematic analysis, framework matrices and timelines.
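As a small illustration of the quantitative side, the sketch below builds a frequency table and a cross-tabulation from a handful of invented training-attendance records using the pandas library; the dataset, column names and categories are hypothetical.

```python
# Minimal sketch with invented data: a frequency table and a cross-tabulation
# of hypothetical training-attendance records.
import pandas as pd

records = pd.DataFrame({
    "site": ["North", "North", "South", "South", "South", "East", "East", "North"],
    "attended_all_sessions": ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"],
})

# Frequency table: how many participants were recorded at each site
print(records["site"].value_counts())

# Cross-tabulation: attendance outcome by site
print(pd.crosstab(records["site"], records["attended_all_sessions"]))
```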

Table 12.2. Examples of evaluation levels

Formative
Title: Primary care provider perceptions and experiences of implementing hepatitis C virus birth cohort testing: a qualitative formative evaluation11
First author and year: Yakovchenko, 2019
CC Licence: CC BY 4.0
Aim: ‘To inform design and implementation of a quality improvement intervention, ...studied primary care provider (PCP) perceptions of and experiences with HCV birth cohort testing.’ [abstract]
Intervention or program: No intervention. Formative evaluation is about testing the intervention in its early stages.
Study design: A formative evaluation using interviews.
Data collection and participants: Semi-structured interviews with 22 primary care providers, guided by the integrated Promoting Action on Research Implementation in Health Services (i-PARIHS) framework.
Analysis: Content analysis with a priori and emergent codes was performed on verbatim interview transcripts.
Results: A multi-component intervention (awareness and education, feedback of performance data, clinical reminder updates and leadership support) was identified to address a significant need and to be acceptable and feasible for primary care providers.

Process
Title: A mixed methods process evaluation of a person-centred falls prevention program12
First author and year: Morris, 2019
CC Licence: CC BY 4.0
Aim: ‘to determine whether RESPOND [a falls prevention program] was implemented as planned, and identify implementation barriers and facilitators.’ [abstract]
Intervention or program: A person-centred falls prevention program for people presenting to the emergency department following a fall. Consists of 4 evidence-based modules: (1) better strength and balance, (2) better bones, (3) better eyesight and (4) better sleep. The program included telephone coaching, positive health messaging and motivational interviewing for 6 months.
Study design: Convergent, parallel mixed methods.
Data collection and participants: RESPOND trial intervention participants (n=263) and healthcare professionals involved in delivering the program (n=7). Focus groups and surveys with trial participants, interviews with health professionals, audit of telephone sessions to assess adherence to the trial protocol, and data extraction from the trial database to assess recruitment and dose.
Analysis: COM-B framework used to guide analysis. Inductive and deductive coding for thematic analysis; descriptive statistics for quantitative data analysis.
Results: Implementation of the falls prevention program was at a lower dose than planned; however, health professionals delivered the program as planned. Facilitators included positive health messages and the person-centred approach. Complex health and social issues were barriers.

Impact
Title: Design of an impact evaluation using a mixed methods model – an explanatory assessment of the effects of results-based financing mechanisms on maternal healthcare services in Malawi13
First author and year: Brenner, 2014
CC Licence: CC BY 4.0
Aim: ‘To provide an example of how qualitative methods can be integrated into commonly used quantitative impact evaluation designs.’ [para 12]
Intervention or program: Reduce maternal and neonatal mortality by introducing the RBF4MNH Initiative. The Initiative seeks to upgrade infrastructure, put quality-based performance agreements in place and provide women who have delivered babies with monetary compensation.
Study design: Explanatory mixed methods design.
Data collection and participants: Three study components: collection of quantitative data describing quality of care at facilities; quantitative data collection of healthcare utilisation at community level; and a mixture of non-participant observations, in-depth interviews and focus groups.
Analysis: The quantitative component consisted of a controlled pre- and post-test design with difference-in-difference analysis; a grounded theory approach was used for the qualitative data, followed by triangulation across both elements.
Results: The design is expected to create robust evidence measures of outcomes and generate insights into how and why the intervention produces intended and unintended effects.

Outcome
Title: A realist evaluation to identify contexts and mechanisms that enabled and hindered implementation and had an effect on sustainability of a lean intervention in pediatric healthcare14
First author and year: Flynn, 2019
CC Licence: CC BY 4.0
Aim: To use ‘the context (C) + mechanism (M) = outcome (O) configurations (CMOcs) heuristic to explain under what contexts, for whom, how and why Lean efforts are sustained or not sustained in pediatric healthcare.’ [abstract]
Intervention or program: A Lean intervention in pediatric healthcare.
Study design: A case study realist evaluation using interviews.
Data collection and participants: Interviews with 32 participants from 4 pediatric units and neonatal intensive care units.
Analysis: Interviews were analysed using the context, mechanism, outcome configuration (CMOc) heuristic.
Results: A causal link between implementation and sustainability was demonstrated. Sense-making and engagement were identified as critical mechanisms for sustainability. The study provides practical guiding principles that healthcare leaders may incorporate into planned Lean implementation.

Summary

Evaluation is a discrete area of research yet shares many similarities with research conducted in the health and social care field. To create well-planned and delivered program evaluations, it is necessary to understand the approaches, levels and study designs.

References

  1. Mertens DM, Wilson AT. Program Evaluation Theory and Practice: A Comprehensive Guide. Guilford Press; 2012.
  2. Hawe P, Degeling D, Hall J. Evaluating Health Promotion: A Health Workers Guide. MacLennan and Petty; 1990.
  3. What is evaluation? Better Evaluation. 2022. Accessed March 8, 2023. https://www.betterevaluation.org/getting-started/what-evaluation
  4. Stufflebeam D. Evaluation models. New Dir Eval. 2001;89:7-98. doi:10.1002/ev.3
  5. Fawcett SB, et al. Empowering community health initiatives through evaluation. In: Fetterman DM, Kaftarian SJ, Wandersman A, eds. Empowerment Evaluation: Knowledge and Tools for Self-assessment and Accountability. SAGE; 1996.
  6. Pawson R, Tilley N. Realistic Evaluation. SAGE; 1997.
  7. Patton MQ. Utilization-focused Evaluation. 4th ed. SAGE; 2008.
  8. Harris M. An introduction to public and community health evaluation. In: Harris M, ed. Evaluating Public and Community Health Programs, 2nd ed. John Wiley & Sons; 2017.
  9. Australian Institute of Family Studies. Evaluation design. June 2021. Accessed March 8, 2023. https://aifs.gov.au/resources/practice-guides/evaluation-design
  10. Evans D. Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions. J Clin Nurs. 2003;12(1):77-84. doi:10.1046/j.1365-2702.2003.00662.x
  11. Yakovchenko V, et al. Primary care provider perceptions and experiences of implementing hepatitis C virus birth cohort testing: a qualitative formative evaluation. BMC Health Serv Res. 2019;19:236. doi:10.1186/s12913-019-4043-z
  12. Morris R, et al. A mixed methods process evaluation of a person-centred falls prevention program. BMC Health Serv Res. 2019;19:906. doi:10.1186/s12913-019-4614-z
  13. Brenner S, et al. Design of an impact evaluation using a mixed methods model – an explanatory assessment of the effects of results-based financing mechanisms on maternal healthcare services in Malawi. BMC Health Serv Res. 2014;14:180. doi:10.1186/1472-6963-14-180
  14. Flynn R, et al. A realist evaluation to identify contexts and mechanisms that enabled and hindered implementation and had an effect on sustainability of a lean intervention in pediatric healthcare. BMC Health Serv Res. 2019;19:912. doi:10.1186/s12913-019-4744-3