About those Sign-to-Speech Gloves (& “Better Than Nothing”)

I had a lot of people send me the link to the news release about two young inventors who have supposedly discovered a way to translate ASL into spoken English. The HuffPo article about it is linked here.

Confession: I rolled my eyes when I saw it. 

I just couldn’t believe that something like that could actually do what it claims it will. I mean, how on earth could it ever capture the facial expressions that are so necessary to signed language? How? At best, it seemed to me to be a sort of tangible YouTube captions, taking overly exaggerated gestures and maybe sometimes getting it right.

But did I say that to anyone who sent me the link?

Why no. No, I didn’t.

I am pushing myself to stay as positive as possible about things, and trying to be all, “yay!” instead of “nay.”

So I didn’t say anything beyond that the invention looks cool and that I hope it will work. Both are true: I do think the invention looks cool and I hope it will work. I just don’t personally think it will.

Some new articles by people far more expert than I have emerged and are worth reading. This is one: Ten reasons why sign-to-speech is not going to be practical any time soon. It’s really fantastic, and kind of gives me the balls to talk a bit about something that bothers me.

I don’t actually think that implementing some type of rudimentary, less-than-perfect technology as a part of disability access is all that helpful. I think you should get it right, or test it more.

This is the problem, as I see it: when something is invented or created as a temporary access solution even though it is far from perfect, too often the real solution is placed on the back burner. It becomes like, “yeah, well, it’s better than nothing,” so the permanent ramp isn’t placed, correct captions are not developed, appropriate class supports are not implemented. Nothing to me illustrates this better than YouTube captions.

YouTube captions take speech and auto-caption it. Have you ever gone there, turned the sound off completely (if you are hearing) and relied solely on the auto-captions to guide you through what is happening?

If you have, then you know it’s a headache. It’s confusing. It’s often gibberish. It’s real work on my part to fully discern content.

But let me tell you! When I ask for captions for videos, I am told more often than not that the “YouTube captions are there!” People don’t bother to caption their videos because they are relying on those crappy YouTube auto-generated ones, which are supposed to be better than nothing. I personally think they are worse than nothing, because they make people try less and put the full burden of figuring content out squarely on the person who needs the captions.

Most of the time, when I see that my only course of action is to use the auto captions, I quit. I won’t even go there anymore; I’m just too sick and tired of trying to figure out a bunch of content that makes no sense.

Given that, it’s not better than nothing for me; it is nothing. And it’s a nothing without recourse – I can’t knock politely on the video creator’s door and ask for captions because they simply say, “but YouTube captions are there!”

This idea of creating gloves to translate ASL might be a great one. But I sincerely hope they don’t come remotely close to marketing it unless and until they actually have it down. Sending yet another thing out into the disability community that doesn’t actually do the job doesn’t make our lives easier; it makes them harder. Because not only are we going to have to still try and figure out how to access content, but we’re going to have to battle the notion that it’s done already by the imperfectly designed “better than nothing.”

:// end rant.

Meriah is a deaf blogger, global nomad, tech-junkie, cat-lover, Trekkie, Celto-Teutonic-peasant-handed mom of 3 (one with Down syndrome and one gifted 2E).
She likes her coffee black and hot.

5 Comments

  • Sign to speech is more kanji to romaji than Japanese to English. Some people don’t know how to add captions to their videos. (It can be done with annotations.)

  • Your comments about YouTube captioning hit home. I have only posted one public video on YouTube. When I did, I took the time to caption it myself because I didn’t want it to be auto captioned. Did it take me almost 44 minutes to caption a 7 minute speech? Yes. Was it worth it? Yes. Would I do it again? Yes.

  • Those gloves will eventually get better, but unless I’m missing something, the primary benefit is for the person with hearing. Where’s the reciprocal technology that translates speech to sign? Anyhow, re YouTube captioning, the more auto-captioning is used, the better it will become, and that’s part of the goal behind it, at least according to Ken Harrenstien, a Deaf engineer at Google (who spoke at a DREDF event in 2012). From a 2011 article in Scientific American:

    “Harrenstien, who is deaf, is the principal engineer behind the infrastructure that serves, manages and displays captions and a primary motivating force for the company’s captioning projects.

    Harrenstien recounts that most of the team working on the captioning project was “extremely concerned” about the quality of the first auto-captions. “I kept telling them over and over and over that, as one of the potential beneficiaries, I would be ecstatic to see even the most inaccurate captions generated by our algorithms,” he says. “Most people don’t realize that TV captioning for live events [such as sports] is generated by humans but can still often be atrocious to the point of illegibility. Still, if you know the context and have a good grasp of puns and homonyms, you have a shot at figuring out what’s going on—and it’s a lot better than nothing.”

    Despite the difficulty generating highly accurate auto-captions, Harrenstien says he was confident from the beginning that YouTube’s automatic speech-recognition algorithms would improve over time and that the more auto-captions were used on the site, the more likely the company’s engineers would be given an opportunity to improve the technology. “It works as well as we can make it, and I love it for that reason,” he adds. “It is not perfect, does not pretend to be perfect, and may never be perfect, but it’s a stake in the cliff we’re continuing to climb.”

    The best way to improve auto-captioning accuracy in such a way that it can be used by the millions of videos on YouTube is to feed more data to a larger, richer model of spoken language, essentially training the YouTube software to better interpret spoken words and place them in context, Cohen says.”

    • Thanks, Susan. I really appreciate the background on YouTube captions.
      I don’t agree that it’s better than nothing, personally, because the captions can be so bad that the content is misleading. For some things, as he says, you can figure it out if you have a general grasp of the situation, but for a lot of other things (mostly the things I actually need the captions for!), trying to figure it out on the basis of context and stuff is just too much for me.
      So, I just end up wondering… if the system only improves with use, but using it is an exercise in headache-inducement for the people who need it, what’s the point? It’s like a painful catch-22…
