We may need to talk
Start:
Can you:
- Think faster than you can type?
- Talk faster than you can type?
If that’s you, and you haven’t tried voice recognition, or not for a while, then maybe, now, you should.
(Please don’t worry about the odd occasion when you talk faster than you are thinking. We all do it. In my case, usually after wine.)
Here and now:
I am dictating this. Into my iPhone. About as fast as I am thinking it. (For some reason I find that voice recognition is faster and more accurate in iOS than in macOS Sierra. But any account of the relative capabilities of different softwares is likely to be overturned by the next iteration of the software. I suggest, start with what you have.)
Seeing my spoken words appear on the screen in front of me is still, for me, a small kind of magic.
I was a fan of the idea of voice recognition before it was, frankly, much good. My efforts with early versions of Dragon were – well, disappointing. Training to my voice was slow; the quality of the voice capture was often poor; and it took its time transcribing. (A colleague claimed that he trained his Dragon installation by reading it Alice in Wonderland. Which explains a lot about his writing style. And, occasionally, his content.) I believe Dragon is a lot better now. As I am sure are the others – it’s a jungle out there, and evolution is rapid and brutal, but great for the customer.
Nowadays: no training. Straight in. Fast, or fast enough – almost as fast as I speak, certainly as fast as I can speak clearly. Accurate enough most of the time to mean that, even allowing for the necessary editing, voice recognition is quicker than my (not very good) typing.
A caution:
Voice recognition, certainly the iOS version, tries very hard indeed to make sense of what it hears. Sometimes, this takes it a long way away from what you meant and said. The main practical implication for me is – review often. Because, if you get a few paragraphs ahead before you check, you simply may not be able to reconstruct what you originally meant / said. And that precious pearl may be lost forever.
Learning:
But two things will happen over (quite a short period of) time.
- The system will get to know you and your speech better, thereby becoming more accurate. This is fairly obvious.
- More interesting, you will make the curious accommodations required to speak written English. I’ll say more about this.
Writing and speaking:
Crudely: writing and speaking are different kinds of language, different forms of expression.
Here’s a short test you might want to do. If you have access to voice recognition – and you almost certainly have, either on your smart phone or via Google – try it now.
- Turn on voice recognition and then talk about something you’re interested in for a couple of minutes, as if you were talking to a friend or colleague.
- Write / type for a couple of minutes on the same topic.
- Compare the transcript of your speaking with the words that you typed/wrote.
What do you notice?
Obviously, I don’t know. And the comparison isn’t simple:
- You may have been so conscious of the fact that your spoken words were being transcribed that you spoke something much closer to written language than your more normal speech.
- Or your writing for a colleague may have been much less formal than if you had been writing for a wider audience.
But it may be that:
- Your transcribed speech took more words to say roughly the same thing than did your writing / typing;
- Your transcribed speech was less formal, more conversational, than your written / typed text;
- In particular, your transcribed speech may have conformed less well to grammatical conventions, especially to clear and conventional sentence structure and breaks between ideas or sentences.
- Unlike most people, who generally talk mostly in phrases, you may already talk in complete sentences. Or even paragraphs. Or even chapters. Or even books. If you are one of these lucky, talented people, then voice recognition will be largely unproblematic for you, and you will simply become much more productive. I envy you.
Of course we all have a range of forms of both spoken and written expression. We can speak formally or informally, we can write formally or less formally. Audience and context make a big difference to how we speak and write.
But the fact remains that, for voice recognition to give you all the potential advantages of speed, you do need to learn to speak in something like the way or style in which you write, or want to write. And, as I am discovering as I dictate this paragraph, this is hard work, and requires fierce concentration – because, when you are using voice recognition, you are writing, in the specific sense of getting words onto the screen, faster perhaps than you could if you were typing.
But speaking into voice recognition slowly becomes easier. And then you will begin to see typing as it is – another form of technology-impeded human action. Like driving a car with manual gear change – “stick shift”, as our American friends call it. Or indeed driving a car at all in the age of Google car and Tesla autopilot. Or repeatedly typing (or speaking) the same information into form after form after form. Or – insert your own technological bête noire here.
Voices:
I over-simplified earlier. I don’t think it’s as simple as learning to speak in the same kind of written English that you used to produce. I write Tweets, emails, blogs, articles and book chapters through the use of voice recognition software. Here are some things I have noticed:
- My style has become slightly less formal as I have made the transition to voice.
- My sentences initially became longer, sometimes I felt too long – at first I used to solve this problem in the editing, whereas now (as is not demonstrated in the current sentence!) I have taught myself to speak in short sentences again. When appropriate. Or when I remember.
- On a good day my writing is a little more vivid – I’m less likely to censor short flights of imaginative expression in speech than I am in writing. Of course, if I don’t like these flights, or don’t think them appropriate to the intended outlet or audience, I can always cut them out. And I do. Sometimes with a tear. Following Faulkner’s advice, endorsed by Stephen King, to “kill all your darlings.”
But the gain in speed I achieve (when I dictated the words “I achieve” just now they were transcribed as “hi cheese”, which briefly entertained me) through the use of voice recognition is so great that the necessary additional editing time still leaves me ahead.
Quality:
What about the quality?
Quality, I feel, is as much a consequence of:
- The research and thought and planning that go on before the thoughts are expressed either in speech or in writing, and
- The process of editing
as it is a consequence of the process of committing thought to screen.
Although … some of my better quality ideas make it to the screen because I capture them in speech, whereas I might well have lost them by the time my fingers caught up with my occasionally fleeting thoughts. This capture of what would otherwise have been lost just now happened, in the previous sentence. (The rather informal starting of a new paragraph with “Although” was undertaken as I dictated it, because I was conscious that the sentence was becoming rather long, but I didn’t want to lose the train of thought as I struggled to find an editing solution. Anyway, I want in this post to show how, now, for me, speech becomes writing.)
Environment:
I prefer to use voice recognition when I am alone. I share an open plan office with my partner, and I still feel a little embarrassed when talking to a machine while Carole is in the room, although she assures me that, as someone familiar with open plan office working, she is bothered by it not at all. But that doesn’t seem to stop it bothering me!
End:
Try voice recognition. Academics exhibit greater strangenesses than speaking, slightly after the manner of a BBC radio announcer from the 1930s, into a telephone or a computer. Pretend you are that rude person on the train, who shares (often) his enunciated thoughts with the carriage.
The learning curve for fluency with voice recognition is long but gentle. The benefits, including speed, start to show very early on.
And you may discover, as you proceed, that the computer isn’t the only one learning to recognise your speech. You may also become better able to recognise, appreciate, enjoy and improve your own various voices.
Let me know.
I’ve tried using dictation, but so far without success. Like you, I tried Dragon way back when – in my case prompted by a computer overuse injury – but I gave up quickly because my eyes (and body) got tired having to keep such a sharp eye on the screen. As a fast touch typist…
now re-visiting the idea (again catalysed by computer overuse issues) – I’m a Mac user; all seems to be changing. I tried Dragon a few years ago when I broke my wrist, but never got it going well – felt I needed more support which I couldn’t locate. And moved back to touchtyping. Dragon will apparently no longer work with the Mac with Catalina, and / but other dictation seems much improved.
I am aware of some OU colleagues using dictation very effectively for marking TMAs – but not sure quite how they got to this state, or how much they still need to use their fingers as well.
Interesting points on how speaking to type may impact on how you express yourself. And vice versa – eg learning to speak in shorter, clearer sentences.
mmm
People also for example use standing desks more often these days – and I wonder whether we think the same way standing as sitting. I’m not sure I can – but may explore this further. It also depends on whether print/ paper or only computer/ online stuff is being used, I think.
actually, I’m also interested in how dictation may support someone who is very deaf. So, if we spoke and the words got written down in large type this could help with conversation. However, it seems to me it requires recognising the issue, and being willing to engage with it. It can be slow to get things going. And it is not just the deaf person, but friends and colleagues who need to engage with it to have any hope of helping with communication. Otherwise, it can be so isolating. It seems to me that support for disability – whether computer overuse, or deafness – is often very crude. It would be great to make it more sensitive.
Catherine
I am glad you found the post interesting.
The voice recognition built into Mac works well, especially with a headset microphone to cut out background noise.
The balance shifts if you are a fast typist. I understand that.
I have not thought sufficiently about its possible applications for people with disabilities.
For me, voice recognition is just another tool. It is one that I have embraced. I am dictating this response. I am glad I experimented with it. Perhaps the wider the range of tools we use, or at any rate play with, the more likely we are to find the combination that works for us.
Also, I still find that there is something gently magical about my spoken words appearing more or less accurately on a screen.
Thank you again.
David