Within days of each other, two tech titans announced higher-education partnerships that use transcription technology to make educational materials accessible to a broader swath of learners.
On April 5, Microsoft announced a partnership with the Rochester Institute of Technology in New York. Via Microsoft Translator, a translation service, students in classes and lectures can get automated transcriptions on their mobile and desktop devices. Professors can also choose to show the transcriptions on a big screen behind them. The partnership’s primary aim is to support students who are deaf and hard of hearing.
This is Microsoft’s first large-scale deployment of this technology to a higher-ed institution, says Xuedong Huang, a technical fellow who oversees Microsoft’s work in speech, natural language and machine translation. Nine classes at the university are currently piloting Microsoft Translator, according to Brian Trager, the associate director for the National Technical Institute for the Deaf’s Center on Access Technology. NTID is one of RIT’s nine schools. A Microsoft spokesperson says over email that this technology is publicly available, and that the company is aware of dozens of K-12 schools that are using it in the classroom.
The company says its transcription technology is also useful beyond accessibility. Huang tells EdSurge the technology supports 60 languages, meaning a student listening to a professor lecture in English can get a translated transcription in, say, German.
Amazon followed with an announcement of its own on Monday. The e-commerce giant announced that Amazon Transcribe, a service that converts speech to text, is partnering with Echo360, a video platform for higher-education institutions, to provide automated captioning of lectures displayed alongside the video. Students will be able to download the transcripts and reference them later.
“It’s going to improve the level of engagement, which is what correlates directly to better grades,” claims Echo360 CEO and founder Fred Singer.
An Amazon spokesperson says over email that Echo360 has used AWS services since 2013. Singer says Echo360 considered other companies as well—it “had the full range of all the major companies in the world that do this type of work.”
The Amazon spokesperson adds that Amazon continues to “look for ways to work closely with academic institutions as customers and collaborators.”
Amazon and Microsoft are certainly not the first companies to think about transcription. Existing services used in the education sector include GoTranscript, which says it uses professional transcriptionists. Founded in 2005, the Edinburgh, Scotland-based company claims on its website that it serves “over 50 top universities for lecture transcription.” San Diego-based Temi claims to use algorithms to deliver quick transcriptions for corporate and higher-ed institutions.
Effectiveness of Transcription
Both Echo360’s Singer and Microsoft’s Huang believe their respective partnerships will help students who are deaf and hard of hearing. These tools could also help educational institutions that have found themselves in legal trouble.
In 2015, the National Association of the Deaf sued Harvard University and the Massachusetts Institute of Technology for “failing to provide closed captioning in their online lectures, courses, podcasts and other educational materials.” And a federal judge ruled in 2013 that Creighton University’s medical school “must provide a deaf student with an interpreter and a transcription service.”
Beyond legal compliance matters, some educators also believe transcription of lectures can better support learners of all abilities and needs. The improvements that Stuart Dinmore and Jing Gao of the University of South Australia noted in a 2016 paper include more “accessibility for deaf or hard of hearing viewers,” better “comprehension for all students” and support for those who are learning a new language.
However, researcher Juan Cristobal Castro-Alonso, who works at the Center for Advanced Research in Education at the Universidad de Chile, has a different perspective. He describes to EdSurge the modality effect, which holds that people learn better from narration paired with visuals than from text paired with visuals. For students who are not deaf, he says, reading on-screen text while also watching a professor give a lecture could make it difficult to pay attention.
He explains that there are gray areas. For instance, it may not be a problem if a student doesn’t necessarily have to watch the professor as he or she speaks. But if a professor is demonstrating something in class that requires the student to watch, such as how to play a violin, then it becomes an issue for the student to also read text on screen. His or her attention would be split between the demonstration and the text.
Castro-Alonso adds that reading the transcription notes while not watching images or videos—such as the case of a student who looks at the transcription after watching a lesson—would not be problematic.
That after-class use is exactly what RIT professor Sandra Connelly says her students enjoy: they access the transcripts to review for exams and to get help with their homework. The new technology, however, isn’t perfect, she says.
“There are always hiccups in new technology, and the students are fairly forgiving of that,” writes Connelly. “I think if the error rates were even lower..., even more students would find the captions useful.”