Audio annotation tools

I’m looking for tools to enable me to annotate audio files. I would love any input and thoughts, and if I manage to find a slick way to do what I want I’ll use this topic to share.

I want…

  • something I can use on my notebook/desktop. It can be online (a website, for example), or it can be software that runs on my computer.
  • must juggle three things: An audio file, a transcript, and my annotations.
  • must enable me to somehow work with my annotations in aggregate: search in my annotations, or tag my annotations, or something. I want to be able to “follow a theme” through my annotations, across all the audio-and-transcripts.
  • playback synchronizing the audio and the transcript

I’m concerned about…

  • long-term usability. I’m concerned about loading hundreds of files into someone’s service, putting in tons of work annotating them… and then having it all go away. So the simpler it is overall, the better. Standard file formats, simple files.

This tool doesn’t have to create the transcripts. I’m imagining that I already have an audio file and a transcript, and I put those files into this tool.

I’m imagining I’ll need the transcripts to be created in a format beyond simple plain text. Most of us talk about “transcripts” and we mean a plain text file that has everything meant to be read by people. There are also subtitle formats that are meant to be read by software; that way the audio playback and the text display are synchronized. (Tools like Descript and Otter and Rev do this internally, and allow us to export plain text. Some even let us export the subtitle format, which is what I’d have to use to get this idea working.)
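
For anyone who hasn’t looked inside a subtitle file: here is a rough sketch of the .srt layout and how little code it takes to read one. The sample text and the names `parse_srt` and `parse_time` are made up for illustration; real exports may have quirks this doesn’t handle.

```python
import re
from datetime import timedelta

# A made-up two-cue sample: index line, timing line, then the text.
SAMPLE_SRT = """\
1
00:00:01,000 --> 00:00:04,500
Welcome to the show.

2
00:00:04,500 --> 00:00:09,250
Today we talk about annotation.
"""

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def parse_time(stamp):
    """Turn 'HH:MM:SS,mmm' into a timedelta."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def parse_srt(text):
    """Return a list of cues as dicts with index, start, end, and text."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip anything that is not a full cue
        start, end = lines[1].split("-->")
        cues.append({
            "index": int(lines[0]),
            "start": parse_time(start),
            "end": parse_time(end),
            "text": "\n".join(lines[2:]),
        })
    return cues


cues = parse_srt(SAMPLE_SRT)
```

Because every cue carries its own start and end time, a player (or a script) can show the right text for any moment in the audio.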

That’s my thinking so far…



This is one tool that I’m bookmarking for digging into…


My initial thoughts were SoundCloud and YouTube, but they could change their tech and leave you high and dry.

You could just add notes to the closed caption files, the way music is described in some captions: “:musical_note:Uplifting music is playing”. Perhaps even use an emoji to mark all your annotations, e.g. :spiral_notepad:.
This would be application-independent, but it would be limiting if the annotations are long.
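
A rough sketch of how the marker idea could make annotations searchable, so you can follow a theme through them. It assumes annotations are caption lines that start with the spiral notepad emoji; the sample captions and the function name `find_annotations` are invented for illustration.

```python
import re

# Spiral notepad emoji as the annotation marker (an assumption; any
# character that never appears in the spoken text would work).
MARKER = "\U0001F5D2"

# Made-up caption file: one transcript cue and two annotation cues.
SAMPLE = (
    "1\n00:00:01,000 --> 00:00:04,000\nWelcome back to the show.\n\n"
    "2\n00:00:04,000 --> 00:00:06,000\n"
    + MARKER + " first mention of daily practice\n\n"
    "3\n00:01:10,000 --> 00:01:12,000\n"
    + MARKER + " fear of starting comes up again\n"
)


def find_annotations(srt_text, keyword="", marker=MARKER):
    """Return (start_time, note) pairs for marked lines containing keyword."""
    hits = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # need index, timing, and at least one text line
        start = lines[1].split("-->")[0].strip()
        for line in lines[2:]:
            if line.startswith(marker):
                note = line[len(marker):].strip()
                # Case-insensitive match; an empty keyword matches everything.
                if keyword.lower() in note.lower():
                    hits.append((start, note))
    return hits


practice_notes = find_annotations(SAMPLE, "practice")
all_notes = find_annotations(SAMPLE)
```

Running the same function over a folder of caption files would give one list of timestamps per theme, using nothing but plain text files.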


Using a subtitle file (or several) is what would seem to work best. That will give you multiple text files covering the same timeline.

I think the key is to use a media container format like Matroska (.mkv), so you can hold the subtitles and audio in the same file container. And .mkv isn’t going away anytime soon, plus I think you can add chapters.

Premiere doesn’t handle .mkv but DaVinci Resolve does, and it’s free for what you want.

I don’t know if .mkv can handle WAV, but I’m pretty sure there is at least one lossless codec it can handle (FLAC?).

I guess other video container formats that can handle subtitles might work, but you’d have to check if they limit your audio quality options and formats.

Sooooooo glad I asked all this!

I was looking, and services such as Otter.ai can export .srt format, which gives me subtitle data files from automated transcription.

I’m beginning to see a system here . . .

Yeah, drop the Otter .srt into DaVinci Resolve with the audio file, then use Resolve to write your annotation comments into another .srt file, and save everything as .mkv.
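
The bundling step could probably also be scripted outside an editor. A rough sketch, assuming ffmpeg is installed (nobody above has mentioned it, so that part is my assumption) and using made-up file names. It only builds and prints the command; it doesn’t run anything.

```python
import shlex


def build_mux_command(audio, transcript_srt, notes_srt, output_mkv):
    """Build an ffmpeg command that puts audio and two subtitle tracks in one .mkv."""
    return [
        "ffmpeg",
        "-i", audio,            # input 0: the audio file
        "-i", transcript_srt,   # input 1: the transcript subtitles
        "-i", notes_srt,        # input 2: the annotation subtitles
        "-map", "0:a",          # keep the audio stream(s) from input 0
        "-map", "1",            # keep the transcript track
        "-map", "2",            # keep the annotation track
        "-c:a", "copy",         # do not re-encode the audio
        "-c:s", "srt",          # store subtitles as SRT text inside the container
        "-metadata:s:s:0", "title=Transcript",
        "-metadata:s:s:1", "title=Annotations",
        output_mkv,
    ]


# Hypothetical file names, for illustration only.
cmd = build_mux_command("episode.flac", "transcript.srt", "notes.srt", "episode.mkv")
print(shlex.join(cmd))
```

The two .srt files stay as ordinary text files on disk either way, so the .mkv is a convenience copy and not the only place the annotations live.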

And DaVinci Resolve has a top-class audio editor built in, called Fairlight.

Another thing, this time an app, that seems closely related to annotating: a way to capture bits of books, podcasts, etc. that you want to remember. Not a good fit for what I’m trying to solve, but worth capturing here…