Speech Recognition Options for Computer Access

Podcast Transcription

Hello and welcome to this podcast. I’m Mike Marotta from the Assistive Technology Center at Advancing Opportunities.

This interview was recorded during the 2011 Texas Assistive Technology Network Statewide Conference in Houston.

The title of this session is Speech Recognition Options for Computer Access. The presenter is Ed Hitchcock from the Rehabilitation Institute of Chicago. The session description reads: Students with a variety of upper extremity motor deficits may benefit from speech recognition for more efficient text entry. Additionally, speech recognition can be used for hands-free access to the computer. Strategies for training students will be reviewed. Students with voice and vision deficits will also be addressed.

Mike Marotta (MM): Hey Ed, how are you? Thanks for talking with me today.

Ed Hitchcock (EH): Not a problem.

MM: You did a session on speech recognition at the conference. What do you think is the most important ability someone must have to be successful with speech recognition?

EH: Well, probably the things I would highlight are, obviously, someone has to have speech. Part of the point of my session was that you do not have to have perfect speech. Even people with some mild dysarthria, some vent dependencies, some breath support issues can still be successful with Dragon or with a speech recognition program, but it becomes a value judgment: am I more efficient with Dragon than I am with, say, keyboarding with my one hand or whatever other abilities I have? Somebody also needs to have a fair amount of tolerance and patience with technology.

MM: I think anybody who has used it will be shaking their head right now!

EH: Exactly. A large part of my session was talking about managing people’s expectations. I use the example of the advertising: the advertising for probably the main speech recognition program leads you pretty seriously to believe that you will be typing at 120 words per minute with 99% accuracy after 10 minutes of training.

MM: Right. With your feet up on the table and smiling.

EH: Exactly. I think they shoot themselves in the foot because most people who have tried it do not experience that kind of work. But nonetheless, it can be a valuable tool for some people. I said in my session, I don’t want to sit here and talk it down for three hours or I wouldn’t be here. I do use it and I am successful with it and I have clients who are successful with it but it’s important to manage those expectations.

MM: Yes – I think that really is – you hit it right on the head with that – is the key. It’s almost more than how good your speech is, it’s all those other things you were just talking about. It’s almost more important. The program has gotten good enough, the Dragon program, to recognize those other speech patterns. That’s very true.

When you go about training someone to be a user of this technology, how do you build their skill set up to be the most effective users they can be with this?

EH: The big thing that I try to focus on initially is making sure that people understand how to dictate, because the process of dictation is very different than the process of talking. If you listen to the way I’m talking right now, I’m putting some thought into how I would speak if I was dictating. If you compare that to how I was speaking in the first part of this podcast, you hear that there are differences in the way that we talk, and there is a way that Dragon or a speech recognition app will want to hear your voice versus just the way that we always talk, the way we talk if we are just having a conversation. “What are you doing for lunch? What’s going on with you?” That’s very different than dictating to create written text. Following those dictation skills, which are by no means a natural skill for most people, especially not students, I try to make sure that people have a good understanding of how to apply the correction strategies that are involved in the program. If people don’t get that, it will not be a successful experience.

MM: Regardless of their voice or speech patterns.

EH: Regardless of their voice or what have you.

MM: Right. It doesn’t matter. That’s true. Can you share with us the story of someone that you have used this with and found it to be very successful?

EH: Sure. I can talk about, I’ll call her Kim. Kim was somebody who used to use Dragon Dictate way back in the day. I think around version 9 – I want to say it was version 7, 8 or 9, somewhere in that area – she tried NaturallySpeaking. She had a mild dysarthria, she had some pretty serious breath support issues, and her typing was very slow – she could only type with one finger on one hand and the motion was very slow. So she did word prediction, she did abbreviation expansion, she did all the other things and she was very proficient. But she pretty quickly realized that even Dragon at 60% or 50% accuracy was faster for her than her own typing. That is kind of why I say it becomes a value judgment on the part of the user and what the task is. Kim was still in school at that point and one of the things that was interesting with her, I’ll be honest, is that she was one of the ones that educated me. I looked at her accuracy and said, “Kim, there’s no way you want to use this program.” And she was like, “No, I do.” It was an eye-opener for me just in terms of that value judgment. But the interesting thing that happened with her is that we really worked with her on using the keyboard and mouse to accomplish the correction strategies, so she didn’t have to use her voice, which was not being recognized well, to correct her text. And it wasn’t a big increase in time over what she would’ve had to do to just type out the thing in the first place. We got her pretty good with the correction strategies. I acknowledge that she was above average in maturity, above average in willingness to put up with this nutty program and nutty therapist! But she is somebody who ultimately got much better accuracy, and she actually started hitting much closer to 80 or 90% accuracy, and it really turned into a functional tool for her. It’s interesting – she is now out of school, she got a new computer, and all she wants to do is surf the web, so she doesn’t use Dragon anymore, and that’s fine. But it got her through school. She is not in a position right now where she needs to produce a lot of text, so she doesn’t use it. Again, the value judgment on the part of the client or the student becomes key.

MM: It is very true. That’s excellent – that’s a good point. I know we’re speaking a lot about Dragon NaturallySpeaking as a third-party add-in software. What are your impressions of the built-in software that does speech recognition on Windows 7 machines and the Mac OS?

EH: The Windows 7 speech recognition is quite a capable program. I have actually had quite good success with it. I wouldn’t say that it is so good that I would use it instead of Dragon. But it is good. It takes a little longer to train and I don’t think the accuracy is quite as good. Which, at the end of the day, really does become important when you’re using it to produce something. You don’t want to have to constantly be correcting your text instead of thinking about what you’re actually trying to produce. So that becomes pretty key, but nonetheless it is a decent program. You have to buy a microphone for it, and I know I have had people who have been fine with a three dollar microphone from RadioShack, but I’ve also had people with the $200 Sennheiser that worked fine. So microphones are a whole other story. But when all is said and done, if I can buy Dragon Standard, which does basically everything Windows 7 does, on Amazon for about $30, it probably is worth it just to go with Dragon.

MM: It does make sense. I was always amazed, I saw in a Staples advertisement once that it was $59 and then they give you a $50 rebate, so Dragon was basically nine dollars when you were done. So you can’t go wrong there.

EH: Exactly. It’s not like it’s that big of an investment to go with Dragon versus the built-in program. Listen, it’s been a while since they’ve released a new version of the Windows speech recognition. If in a couple of years they get better with the accuracy, I would probably change my tune. The Mac OS I don’t have as much experience with. I’ve had some people that are able to use it fairly well for command and control of the computer, and to be clear, you’re talking about the OS, not the MacSpeech Dictate or the Dragon Dictate 2.0. The inboard operating system is better for command-and-control. I’ve had people with good voices that kind of know what they’re doing on the computer use it, but to my knowledge it doesn’t really do dictation per se, and most of my users have to have some way of doing dictation, so what’s the point? Speech recognition on the Mac seems a little bit of an afterthought at this point.

MM: Not nearly as integrated as the Windows 7 is now.

EH: Right. Or even comparing Dragon to the Dragon Dictate for the Mac. It’s not anywhere near as well put together.

MM: Not yet. How about – with the iPads becoming so prevalent and everybody wanting an iPad and not quite sure why they want it but just knowing they want it – what’s your impression of the Dragon Dictate app for the iPad?

EH: It is nice. I personally use an Android phone just because I didn’t have an iPhone through the carrier. To my knowledge you can’t put in novel text, which is kind of a drawback. But it is fine if you’re in a place with a good network signal and you don’t mind that you can’t per se train it. Right now I’m looking at your last name: it would do fine for Mike, but it would not do so well for Marotta.

MM: It does not do too well at all.

EH: It does well for Hitchcock because of my less famous relative! But it doesn’t do as well with my wife’s maiden name which is Feder, which is spelled F E D E R. You’re never going to get it to recognize that kind of thing well. So as long as you understand the limitations of it, it can be very nice and I’ve been pleasantly surprised at the accuracy.

MM: I have been surprised as well with people I’ve used it with. Very surprised. If you look at speech recognition as an area of technology, looking ahead could you picture something in the future – what do you think the next breakthrough in this area of technology will be?

EH: My guess is that they are going to continue to work on speaker independence. That is what would really be nice. The training process has its advantages obviously but integrating Dragon – which I can get to recognize Marotta no problem – getting that kind of functionality integrated into a speaker independent system which by the way still allows for nonstandard voices, whether it be accents or breath support or dialect or what have you – that’s where I think it would be going. A little bit more speaker independence which would be nice to see.

MM: That would be great. Now if people have more questions how can they get in touch with you?

EH: Best way to reach me is e-mail – I am much more an e-mail person than I am a phone person – and it is ehitchcock@ric.org. Particularly for those people who were in my sessions, which I continually seem to run over in, if you want to put Houston in the subject line within the next 60 or 90 days, I will make it a real point to get back to you quickly where I might not otherwise!

MM: Excellent. Well Ed thank you very much.

EH: You are very welcome.

 

Thanks for listening to this podcast. For more information about the Texas Assistive Technology Network, visit the website at www.texasat.net

For more information about the Assistive Technology Center at Advancing Opportunities, visit the website at www.assistivetechnologycenter.org

The music used in this podcast is by Kevin MacLeod and is used with permission under the Creative Commons License 3.0
