The abstract for our presentation to the Conference on Disability, Accessibility and Representation in the Creative Industries (DARCI) in September 2025
Using AI-based tools to monitor subtitle quality.
The RNID's 2023 report, "Subtitle It," highlights the significant challenges viewers face in accessing subtitles through on-demand platforms. While the Media Act 2024 mandates minimum quotas for subtitle provision on "tier 1" services, subtitle quality remains an issue which impacts accessibility and viewer experience.
AI-based speech-to-text tools cannot provide broadcast-quality subtitles, because they produce their own types of errors, but they can be used to monitor some of the problems which affect the quality of television subtitles and degrade the audience experience. This presentation will demonstrate how a modified version of Whisper, OpenAI's speech-to-text engine, combined with natural language processing and a simple statistical approach, can usefully quantify problems with timing and word omission in subtitles for broadcast and on-demand content.
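As a rough illustration of the kind of measurement described above, the following Python sketch compares a word-level speech-to-text transcript with the words of a subtitle file. It is not the method used in this work: it assumes the word timings have already been extracted (for example from Whisper output and from the subtitle cues), it uses the standard library's `difflib.SequenceMatcher` in place of any natural language processing, and the function names `normalise` and `compare_subtitles` are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import median


def normalise(word):
    """Lower-case a word and strip punctuation so 'evening,' matches 'evening'."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def compare_subtitles(transcript, subtitles):
    """Estimate word omission and timing offset.

    Both arguments are lists of (word, time_in_seconds) pairs: the
    transcript from a speech-to-text engine, the subtitles from the
    subtitle file. This input format is an assumption for illustration.
    Returns (omission_rate, median_lag_seconds).
    """
    spoken = [(normalise(w), t) for w, t in transcript]
    spoken = [(w, t) for w, t in spoken if w]
    shown = [(normalise(w), t) for w, t in subtitles]
    shown = [(w, t) for w, t in shown if w]

    matcher = SequenceMatcher(
        None,
        [w for w, _ in spoken],
        [w for w, _ in shown],
        autojunk=False,
    )

    # For every word that appears in both sequences, record how far the
    # subtitle time trails (positive) or leads (negative) the speech.
    lags = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            lags.append(shown[block.b + k][1] - spoken[block.a + k][1])

    omission_rate = 1 - len(lags) / len(spoken) if spoken else 0.0
    median_lag = median(lags) if lags else None
    return omission_rate, median_lag


# Invented example data: seven spoken words, six subtitled words.
transcript = [("Good", 0.0), ("evening", 0.5), ("and", 1.0), ("welcome", 1.5),
              ("to", 2.0), ("the", 2.5), ("programme", 3.0)]
subtitles = [("Good", 3.0), ("evening,", 3.5), ("welcome", 4.5),
             ("to", 5.0), ("the", 5.5), ("programme.", 6.0)]

rate, lag = compare_subtitles(transcript, subtitles)
print(f"omitted: {rate:.0%}, median lag: {lag} s")
```

In this invented example one of the seven spoken words is missing from the subtitles and every matched word trails the speech by three seconds. The median is used so that a few mismatched words do not dominate the timing estimate; a real system would also need to cope with speech-to-text errors and with very large offsets.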
We will show how these problems vary across different types of TV programming. These include archive programmes, whose original subtitles omit a substantial portion of the spoken words, and live programmes, where subtitlers re-speak the dialogue or manually cue pre-prepared blocks, producing subtitles that (usually) lag the speech and that omit or reorder words.
These problems will be illustrated with examples from broadcast television and on-demand content, including technical faults and workflow issues. The examples will also highlight the challenges of aligning subtitles with a speech-to-text transcript: this work has revealed subtitles that omit around 40% of the spoken words, subtitles that arrive anywhere from 20 seconds early to 50 seconds late, and a programme broadcast with the subtitles for a different episode.
We will conclude with some observations on current practices and historical trends in TV subtitling and discuss the need for improved quality control and monitoring of subtitles provided for broadcast and on-demand programmes.