Bringing Sounds to Life: Making Sound Effects (and BB-8) Accessible to All

by: David Titmus

What’s the best way to describe a sound to someone? How would you capture, in words, the sound of an explosion as a thief blows up a bank vault, the thud of doors slamming as the robber runs out of the building, or the screech of a getaway car’s tires as it peels away from the scene of the crime?

VITAC’s highly skilled professionals caption prerecorded programming on more than 100 television networks. Our captioners are masters of description, line breaks, timing, readability, and, yes, sound effects that many viewers may not notice or even consider.

The Federal Communications Commission rules for television closed captioning ensure that viewers who are deaf and hard of hearing have full access to programming, requiring that captions be accurate, synchronous, complete, and properly placed. And this includes not only the dialogue on the screen, but also sound effects and other non-verbal cues.

A submarine’s sonar pings during a nautical game of cat-and-mouse in a wartime drama; a dog barks as an intruder approaches the house in that late-night thriller; or Iron Man’s thrusters engage as he swoops in to save the day in the latest superhero blockbuster. These are all crucial (or, in Iron Man’s case, just plain cool) pieces of information that, while not spoken on screen, add important details to the action or help set the mood.

Take, also, the role that music can play in films (it evokes emotion, ramps up the action, or otherwise enhances the storytelling) and how missing that element in certain scenes can leave a film lacking. Captioners also add musical sound effects, noting such things as when the music starts, stops, or intensifies, whether there are vocals or just instrumentals, and whether the tempo slows or quickens.

Offline captioners describe any sound effect not actively represented on screen. (If a speaker references an obvious on-screen sound, as in “don’t slam that door!”, captioners will include that sound as well.) But determining the best way to describe a sound, tone of voice, or musical interlude can be challenging. To that end, VITAC has compiled an ever-evolving reference list of “frequently used” sound effects, modifying them and creating new ones as necessary to sufficiently describe a particular sound.

Typically, captioners use descriptors to convey how something is spoken, and sound effects to convey a sound itself. Descriptors generally do not include nouns, and appear alongside dialogue to describe how something is being said. Examples include “[ Slurring ] I need another drink” or “[ Breathy ] Happy Birthday, Mr. President.”

Sound effects, on the other hand, generally include a noun and a verb. They are used to describe a sound heard or characters communicating without using words. They can include: “[ Dog yelping ],” “[ Dog barking incessantly ],” or “[ Dog whining ].”

Think it sounds easy? How, then, would you describe to someone the sound a lightsaber makes in the Star Wars universe? Or the best way to capture the raspy voice of Darth Vader? Or the slight inflection in the chirps and beeps that R2-D2 or BB-8 make to communicate with their counterparts, whether they be human, droid, or Wookiee?

For instance, here are some of the neat ways captioners described astromech BB-8’s “dialogue” – the robotic beeps and whirrs – in 2015’s “The Force Awakens”: (chirps nervously), (chirps angrily), (chirps merrily), (chirps inquisitively), and (chirps uncertainly). The little droid certainly had a personality in the movie, and the captions do a very nice job of capturing that.

VITAC’s offline captioners seek the most effective and creative way to describe each sound, like a [ crack! ], [ boink! ], or [ whip! ] in a “Three Stooges” episode. It’s this sort of imagination in creating sound effects that puts our human-generated captions head and shoulders above the offerings of automatic speech recognition programs.