We may need to talk

December 5, 2016


Can you:

  • Think faster than you can type?
  • Talk faster than you can type?

If that’s you, and you haven’t tried voice recognition, or not for a while, then maybe, now, you should.

(Please don’t worry about the odd occasion when you talk faster than you are thinking. We all do it. In my case, usually after wine.)

Here and now:

I am dictating this. Into my iPhone. About as fast as I am thinking it. (For some reason I find that voice recognition is faster and more accurate in iOS than in macOS Sierra. But any account of the relative capabilities of different software is likely to be overturned by the next iteration of that software. I suggest: start with what you have.)

Seeing my spoken words appear on the screen in front of me is still, for me, a small kind of magic.

I was a fan of the idea of voice recognition before it was, frankly, much good. My efforts with early versions of Dragon were – well, disappointing. Training to my voice was slow; the quality of the voice capture was often poor; and it took its time transcribing. (A colleague claimed that he trained his Dragon installation by reading it Alice in Wonderland. Which explains a lot about his writing style. And, occasionally, his content.) I believe Dragon is a lot better now. As I am sure are the others – it’s a jungle out there, and evolution is rapid and brutal, but great for the customer.

Nowadays: no training. Straight in. Fast, or fast enough – almost as fast as I speak, certainly as fast as I can speak clearly. Accurate enough most of the time to mean that, even allowing for the necessary editing, voice recognition is quicker than my (not very good) typing.

A caution:

Voice recognition, certainly the iOS version, tries very hard indeed to make sense of what it hears. Sometimes, this takes it a long way away from what you meant and said. The main practical implication for me is – review often. Because, if you get a few paragraphs ahead before you check, you simply may not be able to reconstruct what you originally meant / said. And that precious pearl may be lost forever.


But two things will happen over (quite a short period of) time.

  1. The system will get to know you and your speech better, thereby becoming more accurate. This is fairly obvious.
  2. More interesting, you will make the curious accommodations required to speak written English. I’ll say more about this.

Writing and speaking:

Crudely: writing and speaking are different kinds of language, different forms of expression.

Here’s a short test you might want to do. If you have access to voice recognition – and you almost certainly have, either on your smart phone or via Google – try it now.

  1. Turn on voice recognition and then talk about something you’re interested in for a couple of minutes, as if you were talking to a friend or colleague.
  2. Write / type for a couple of minutes on the same topic.
  3. Compare the transcript of your speaking with the words that you typed/wrote.

What do you notice?

Obviously, I don’t know. And the comparison isn’t simple:

  • You may have been so conscious of the fact that your spoken words were being transcribed that you spoke something much closer to written language than your more normal speech.
  • Or your writing for a colleague may have been much less formal than if you had been writing for a wider audience.

But it may be that:

  • Your transcribed speech took more words to say roughly the same thing than did your writing / typing;
  • Your transcribed speech was less formal, more conversational, than your written / typed text;
  • In particular, your transcribed speech may have conformed less well to grammatical conventions, especially to clear and conventional sentence structure and breaks between ideas or sentences.
  • Unlike most people, who generally talk mostly in phrases, you may already talk in complete sentences. Or even paragraphs. Or even chapters. Or even books. If you are one of these lucky, talented people, then voice recognition will be largely unproblematic for you, and you will simply become much more productive. I envy you.

Of course we all have a range of forms of both spoken and written expression. We can speak formally or informally, we can write formally or less formally. Audience and context make a big difference to how we speak and write.

But the fact remains that, for voice recognition to give you all the potential advantages of speed, you do need to learn to speak in something like the way or style in which you write, or want to write. And, as I am discovering as I dictate this paragraph, this is hard work, and requires fierce concentration – because, when you are using voice recognition, you are writing, in the specific sense of getting words onto the screen, faster perhaps than you could if you were typing.

But speaking into voice recognition slowly becomes easier. And then you will begin to see typing as it is – another form of technology-impeded human action. Like driving a car with manual gear change – “stick shift” as our American friends call it. Or indeed driving a car at all in the age of Google car and Tesla autopilot. Or repeatedly typing (or speaking) the same information into form after form after form. Or – insert your own technological bête noire here.


I over-simplified earlier. I don’t think it’s as simple as learning to speak in the same kind of written English that you used to produce. I write Tweets, emails, blogs, articles and book chapters through the use of voice recognition software. Here are some things I have noticed:

  • My style has become slightly less formal as I have made the transition to voice.
  • My sentences initially became longer, sometimes I felt too long – at first I used to solve this problem in the editing, whereas now (as is not demonstrated in the current sentence!) I have taught myself to speak in short sentences again. When appropriate. Or when I remember.
  • On a good day my writing is a little more vivid – I’m less likely to censor short flights of imaginative expression in speech than I am in writing. Of course, if I don’t like these flights, or don’t think them appropriate to the intended outlet or audience, I can always cut them out. And I do. Sometimes with a tear. Following Faulkner’s advice, endorsed by Stephen King, to “kill all your darlings.”

But the gain in speed I achieve (when I dictated the words “I achieve” just now they were transcribed as “hi cheese”, which briefly entertained me) through the use of voice recognition is so great that the necessary additional editing time still leaves me ahead.


What about the quality?

Quality, I feel, is as much a consequence of:

  • The research and thought and planning that go on before the thoughts are expressed either in speech or in writing, and
  • The process of editing

as it is a consequence of the process of committing thought to screen.

Although … some of my better quality ideas make it to the screen because I capture them in speech, whereas I might well have lost them by the time my fingers caught up with my occasionally fleeting thoughts. This capture of what would otherwise have been lost just now happened, in the previous sentence. (The rather informal starting of a new paragraph with “Although” was undertaken as I dictated it, because I was conscious that the sentence was becoming rather long, but I didn’t want to lose the train of thought as I struggled to find an editing solution. Anyway, I want in this post to show how, now, for me, speech becomes writing.)


I prefer to use voice recognition when I am alone. I share an open plan office with my partner, and I still feel a little embarrassed when talking to a machine while Carole is in the room, although she assures me that, as someone familiar with open plan office working, she is bothered by it not at all. But that doesn’t seem to stop it bothering me!


Try voice recognition. Academics exhibit greater strangenesses than speaking, slightly after the manner of a BBC radio announcer from the 1930s, into a telephone or a computer. Pretend you are that rude person on the train, who shares (often) his enunciated thoughts with the carriage.

The learning curve for fluency with voice recognition is long but gentle. The benefits, including speed, start to show very early on.

And you may discover, as you proceed, that the computer isn’t the only one learning to recognise your speech. You may also become better able to recognise, appreciate, enjoy and improve your own various voices.

Let me know.
