Evaluating AI in health and care is essential

29 September 2020 Kassandra Karpathakis (Head of AI Policy, NHSX)


We recently announced the winners of the 2020 Artificial Intelligence (AI) Award. In partnership with the Accelerated Access Collaborative (AAC) and the National Institute for Health Research (NIHR), we will be supporting our awardees to bring innovative AI solutions into the health and care system.

To ensure the adoption of the most promising technologies is evidence-based and effective, and can be scaled nationally, we are also conducting real-world evaluations. This means we will be evaluating the effectiveness, accuracy, safety and value of each technology based on its performance across selected partner sites.

Each later-stage technology in the AI Award will be independently evaluated in line with the National Institute for Health and Care Excellence’s (NICE) Evidence Standards Framework for Digital Technologies, which classifies technologies according to their function, features, and associated risks. The length and scope of the evaluations will depend on the nature of the AI technology, how well it performs when implemented (leaving room for learning and iteration), and the type of evidence we need to collect.

Because implementing novel AI solutions in the health and care system is complex, our evaluation approach will span all three established evaluation types:

Process evaluation: here we will examine the deployment and operational implications of the AI solution. We will focus on questions relating to the theory of change (‘should it work’) and practical issues with trying to use the technology for its intended purpose (‘can it work’).
Impact evaluation: we will focus on whether the AI solution has achieved its intended effect compared to an agreed alternative (‘does it work’). We will measure impact by seeking to attribute any observed improvement in operational or clinical outcomes to the deployment of the AI solution as opposed to other confounding factors.
Economic evaluation: we will focus on whether the AI solution offers value for money (‘is it worth it’). Our economic evaluation will build on the process and impact evaluations to understand and measure whether the AI solution is a valuable addition to the health and care system.

Throughout all of this, our evaluation teams will consider the perspectives of different stakeholders, including patients, carers, and protected groups. We want to know whether AI technology really does improve outcomes for people and, if it does, whether it is suitable for national rollout across different communities.

This is the first time there will be such a high concentration of evaluations for AI solutions in health and care, presenting an exciting opportunity for us to work with partners, such as NICE, to develop new methods and establish best practice for assessing the use of AI in the health sector. This partnership working will ensure that the UK’s evaluation approach is methodologically world-class and accelerates the spread of effective innovation. By working in the open and sharing insight, we hope to motivate and enable others designing, implementing and evaluating AI in health and care.

Last, but not least, we would like to acknowledge and thank the following contributors who have helped NHSX and the AAC set up the AI Award evaluation approach: Jacqueline Mallender; Professor Alastair Denniston; Dr Xiaoxuan Liu; Professor Chris Holmes; Professor Sebastien Ourselin; Simon Walsh; Samantha Cruz Rivera; Michela Antonelli; Dr Patrik Bachtige; Rohan Boyle; Bernadette Scanion; and Abdullah Mahmood.

If you’re interested in learning more about different evaluation methods and how to conduct your own, check out Public Health England’s step-by-step guide to evaluating digital health products.