Text-based audio editing in radio production, BBC Dialogger // Chris Baume, BBC R&D


  • 9 stations, 34 million listeners -- BBC is pretty big
  • Transcription is a rough, manual process (just enough for producers to understand what's going on); it is slow and a "waste of their time"
  • Producers will pay other people to do it, in Australia so that the time difference lets the work happen overnight, but only if they have the money
  • 2002: SCANMail used timed transcripts for navigating voicemail using text
  • Also a program called SILVER
  • 2004: SCANMail could edit text
  • Then a video editor that works from transcripts with different camera angles and auto-edits the video so there are no awkward jump cuts
  • All these examples come from ACM (Association for Computing Machinery) publications: https://www.acm.org/publications/journals
  • Removing "ums" and other sounds
  • 2016: prototype to quickly revise spoken comments
  • Chris is now demoing some prototypes out of BBC R&D
  • HTML5Compositor
  • Demonstrates the Magic Pen tool: transcripts are printed and written on with a specific pen that has a camera, and those edits are uploaded back to the document, so producers can work at their leisure / on the go / not in the office
  • "Speech to text is a lossy process"
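
The core idea behind text-based editing with timed transcripts can be sketched as follows: every word carries a start and end time, so deleting words in the text (or dropping "ums") tells you which audio ranges to keep. This is a minimal illustration in Python, not how Dialogger or SCANMail actually work; the names `Word`, `FILLERS` and `kept_segments` are hypothetical, and the filler list is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


# Assumed filler vocabulary; a real STT system may not transcribe these at all.
FILLERS = {"um", "umm", "uh", "er"}


def kept_segments(words, deleted_indices=(), drop_fillers=True):
    """Return merged (start, end) audio ranges for the words that survive the edit."""
    deleted = set(deleted_indices)
    segments = []
    prev_kept = None  # index of the last word we kept
    for i, word in enumerate(words):
        is_filler = drop_fillers and word.text.lower().strip(".,!?") in FILLERS
        if i in deleted or is_filler:
            continue
        if segments and prev_kept == i - 1:
            # The previous word was also kept: extend the current segment.
            segments[-1] = (segments[-1][0], word.end)
        else:
            # Something was cut just before this word: start a new segment.
            segments.append((word.start, word.end))
        prev_kept = i
    return segments


words = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.7),
    Word("welcome", 0.7, 1.2),
    Word("back", 1.2, 1.5),
]
print(kept_segments(words))
```

Because speech to text is lossy, word timings in practice are approximate, so a real tool would probably pad or snap these cut points rather than use them directly.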

Ideas for next steps

  • Common base UI element for timed transcript editing
  • Google Docs-style collaborative timed transcript editor/player
  • Better, meaningful annotations (e.g. rate segments, export >4*)
  • Template for EDL file generation
  • Embed transcript and annotations in audio file
  • Umm detection/removal (STT with umms?)
  • Automatic segmentation with tagging and summaries
  • Better time compression
  • Tools for recording multiple versions of a script
  • Digital pen with audio playback, natural annotation and live bidirectional sync
  • Smart correction by exposing STT graphs
  • Fast clipping of a live audio stream using transcripts
  • Bidirectional integration with a proper audio editing system
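
As a rough starting point for the EDL template idea, the sketch below turns a list of kept audio ranges (in seconds) into a plain-text edit decision list. The column layout is loosely modelled on CMX3600-style EDLs, and the reel name `AX`, the single audio track `A`, the 25 fps frame rate and the function names `frames_to_timecode` and `build_edl` are all assumptions for illustration; check the output against whatever your audio editor actually imports.

```python
def frames_to_timecode(frames, fps=25):
    """Format a frame count as HH:MM:SS:FF (non-drop-frame)."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def build_edl(title, segments, reel="AX", fps=25):
    """Build EDL text from (start, end) ranges given in seconds."""
    lines = [f"TITLE: {title}", "FCM: NON-DROP FRAME", ""]
    record_in = 0  # position on the output timeline, in frames
    for number, (start, end) in enumerate(segments, start=1):
        source_in = int(round(start * fps))
        source_out = int(round(end * fps))
        record_out = record_in + (source_out - source_in)
        lines.append(
            f"{number:03d}  {reel:<8} A     C        "
            f"{frames_to_timecode(source_in, fps)} "
            f"{frames_to_timecode(source_out, fps)} "
            f"{frames_to_timecode(record_in, fps)} "
            f"{frames_to_timecode(record_out, fps)}"
        )
        record_in = record_out  # next clip butts up against this one
    return "\n".join(lines) + "\n"


print(build_edl("Interview", [(0.0, 2.0), (3.0, 5.2)]))
```

Working in whole frames rather than accumulating float seconds keeps the record timeline free of rounding drift as clips are appended.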

