The Federal Communications Commission (FCC) earlier this month took initial steps to reform the Internet Protocol Captioned Telephone Service (IP CTS) in an effort to modernize the system and resolve compensation and funding issues.
A popular and critical communications service for people who are deaf or hard of hearing and who communicate by speaking, IP CTS enables a person with hearing loss to call another person and simultaneously read captions of what the other party is saying via a special display screen on the phone or other web-enabled device.
Traditionally, a trained operator – a communications assistant (CA) or voice captioner – listens to the call, repeats what the other party says, and, with the use of automated speech recognition (ASR) software, transcribes those words to text.
Among the items in its recent report and proposed rulemaking, the FCC determined that improvements in ASR technology have made the use of speech recognition by itself an acceptable alternative to the CA-assisted method described above. The FCC also argued that ASR would provide faster, more private captions than those created by voice captioners, and at a lower cost.
VITAC, however, believes it’s not about how the captions are created but, rather, the quality of those captions. To that end, we strongly believe that trained professionals – real people – bring human sensitivity and contextual awareness to captioning that no ASR system can match.
Indeed, when comparing the captions created by humans to those created exclusively by even the smartest of machines, there is an obvious disparity. ASR systems routinely fail to render names and technical terms properly, stumble over accented or mumbled speech and background noise, omit punctuation, and can have difficulty distinguishing what a speaker “said” from what they actually “meant,” among other things.
Caption errors like “foreign voices” becoming “for invoices” or “barista” magically turning into “bar restock” are sometimes comical, but they are no laughing matter when users rely on accurate captions to communicate with someone on the other end of the line.
The FCC, in its report, acknowledged that there likely will be factors that influence ASR’s effectiveness and that automated speech software may best be used on certain types of calls, such as those to customer service centers, where there is likely to be less background noise and clearer articulation by call takers, or calls to friends, relatives, and colleagues, who are more aware of and sensitive to the user’s hearing loss and the need to speak clearly.
The commission’s ruling does not mandate ASR as the sole means of offering IP CTS, and providers will be able to choose among three methods of providing captioned telephone service — captioned service using fully automated ASR; service using CA-assisted ASR; and service using stenographic-supported IP CTS. Consumers, in turn, will continue to be able to select an IP CTS provider based on the overall quality of service and the manner in which captions are created.
Captions play a crucial role in the daily lives of millions, providing a link to education, news, entertainment, employment, and emergency information, as well as a connection to a world that many in the hearing community take for granted. It’s for these reasons, and many others, that clear, concise, accurate captions are essential.