Hey Siri, you’re stupid
Siri, do you obey the Three Laws of Robotics? Like many other silly questions, it’s one that someone at Apple has laboriously anticipated. “I forget the first three,” chirps the response, “but there’s a fourth: ‘A smart machine shall first consider which is more worth its while: to perform the given task or, instead, to figure some way out of it’.”
Ha ha! Imagine the meeting where they wrote that one! Trouble is, it’s not really a joke, is it? That’s their actual development brief.
Siri can do a lot of easy things, like doing sums, checking the weather forecast and sending emails. Demand anything harder and it either parries “I’m not sure I understand” or just does a web search for what you said. Well, whoop-de-doo. Singularity ahoy.
The rise of do-it-all bots – including Amazon’s Alexa, Google’s upcoming Assistant, and even Viv, a rival from Siri’s original developers – throws an uncomfortable spotlight on iOS’s disembodied concierge. If, as is rumoured, it’s about to be announced as a tentpole feature of OS X 10.12 Sensemilla*, we can only hope a quantum leap in smarts is in the pipeline. Because it may have some wisecracks, but Siri hasn’t cracked being wise.
“But it works. Mostly. If you talk like a BBC announcer.”
It’s great that the technology exists to convert speech to text. Not so great that the technology apparently won’t fit in your phone, so it has to sit in North Carolina using up your data allowance. But it works. Mostly. If you talk like a BBC announcer.
Then that text can be mined for keywords to plug into one of Siri’s information sources, which include Apple Maps, Wolfram Alpha, Wikipedia and Microsoft Bing. (Fortunately you can swap Bing for a real search engine in Settings | Safari | Search Engine. Not by asking Siri, though. That would be too easy.)
But the ways Siri uses those sources are far from clever. When I open Maps and type ‘Whickham’ into the search box, it correctly finds a list of places within Whickham, a small town near where I live. Among these is the generic entry for Whickham itself, marked just beside the word ‘Whickham’ on Apple’s map. The typesetting is lovely, by the way.
I shouldn’t really be admiring typesetting while driving, so instead I say: “Hey Siri, navigate to Whickham.” Quick as a flash, Siri finds High Wycombe, Buckinghamshire. That’s the other end of England. It doesn’t ask me if this is the Wycombe I meant; it just starts plotting a route.
(This is not as daft as it gets. When I ask for directions to ‘Washington,’ another town in Tyne and Wear, it offers me Olympia, WA, 4,600 miles away. It’s not even the most likely wrong Washington.)
“No, Siri, Whickham, W-H-I-C-K-H-A-M.”
Now, there’s one entirely obvious way to resolve this confusion: you’d spell the word out. So I try: “No, Siri, Whickham, W-H-I-C-K-H-A-M.” It hears this perfectly well, and immediately announces it can find no matching places. Immediately, mind you – no going off to check sources. Just nope. Remember, Whickham is right there in Apple Maps, the same engine it’s using to show me High Wycombe.
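Handling a spelled-out word isn’t hard to sketch, either. Here’s a toy version in Python – the mini-gazetteer is made up, and a real assistant would search the full maps database, but the principle is just “join the letters, then fuzzy-match”:

```python
import difflib

# Hypothetical mini-gazetteer; a real assistant would query the maps database.
PLACES = ["Whickham", "High Wycombe", "Wickham", "Washington", "Whitburn"]

def match_spelling(letters):
    """Join a spelled-out sequence of letters into a word and
    fuzzy-match it against known place names, best match first."""
    word = "".join(letters).title()  # "WHICKHAM" -> "Whickham"
    return difflib.get_close_matches(word, PLACES, n=3, cutoff=0.6)

# "W-H-I-C-K-H-A-M" puts Whickham at the top of the candidate list.
print(match_spelling(list("WHICKHAM")))
```

Even a near-miss spelling survives the `cutoff=0.6` threshold, which is the whole point: the user gets a shortlist, not a flat “no matching places”.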
The only way I eventually get to Whickham is by remembering a street name there (Front Street) and asking for it, then thinking of another way to ask for it after Siri shows me the wrong Front Street and, again, offers no alternatives. To add insult to injury, it’s picked the Front Street closest to my current location. Obviously.
Sorry, I don’t follow
Other types of query are handled with similar foolishness. “Siri, what’s the difference between Germany’s GDP and Italy’s?” No problem – it goes and gets the correct answer from Wolfram Alpha. Then I try a follow-up question. Siri used to be rubbish at follow-up questions, but bucked its ideas up after Google Voice Search came out and aced them. You can now ask “What will the weather be like on Monday?” and add “How about Tuesday?” and it’s fine. Or cloudy, as the case may be.
So I follow up with: “How about France?” It mishears “France” as “friends”. Of all the words I might have been saying, the fact that I’ve just named two other European states apparently adds no weight to the possibility that I meant “France”. World-leading interactive AI, folks.
I have an 88-year-old relative who’s very deaf. When you repeat something she’s misheard, she guesses a different word. She may still be wrong, but she doesn’t guess the same word again, because why would you have repeated it? I repeat “How about France?” three times. Three times, Siri returns “How about friends?” “Interesting question, Adam,” she responds, brightly.
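What my deaf relative does intuitively can be sketched in a few lines: rescore the recogniser’s competing guesses using the conversation so far, and demote any guess the user has implicitly rejected by repeating the question. The scores and word lists below are invented for illustration:

```python
# Toy hypothesis rescoring. Acoustic scores are made up; in a real
# recogniser they would come from the speech engine's n-best list.
CONTEXT = {"germany", "italy", "gdp"}           # words from the conversation so far
COUNTRIES = {"france", "germany", "italy", "spain"}

def rescore(hypotheses, rejected=()):
    """Pick the best transcription: boost words that fit the current topic,
    penalise words the user has already heard and repeated over."""
    def score(h):
        word, acoustic = h
        bonus = 0.3 if word in COUNTRIES and CONTEXT & COUNTRIES else 0.0
        penalty = 0.5 if word in rejected else 0.0
        return acoustic + bonus - penalty
    return max(hypotheses, key=score)[0]

hyps = [("friends", 0.60), ("france", 0.55)]
print(rescore(hyps))                        # topic bonus lifts "france" past "friends"
print(rescore(hyps, rejected={"friends"}))  # and a repeat rules "friends" out entirely
```

Two named countries in the last sentence should be worth at least a 0.05 nudge; that’s all it would have taken.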
Eventually I work out that I need to say “France” with a posh accent (I’m northern, so “France” rhymes with “pants”). Siri comes back with the phone number of Elizabeth France CBE, which is in my contacts from when she was the Data Protection Registrar. Because obviously I meant a random person whose office I last called 14 years ago, not the country. (The country that neighbours the two other countries I just mentioned.) And I often refer to people just by their surnames. That’s totally a thing. “Siri, what of Carruthers?”
The kind of logic that’s missing here shouldn’t be hard. It’s already there, in some respects. You can say – just out of the blue, apropos of nothing – “Hey Siri, Ivy is my sister.” After checking which Ivy you mean, if you have more than one in your contacts, Siri will ask: “OK, do you want me to remember that Ivy is your sister?” Say “Yes,” and in future you can say “Call my sister” to call this contact.
Family relationships have standard interdependencies, so naturally you can then go on to specify “Oliver is Ivy’s son” and subsequently say “Siri, text my nephew” to send a message to this contact. It’s not rocket science, but it’s a nice… wait a minute. No. Siri’s doing a web search for “Oliver is Ivy’s son”. It can’t even get that far.
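The missing step is a couple of dictionary lookups. Here’s a sketch of the inference Siri half-implements – the rule table is an assumption (a nephew could equally be a brother’s son), but it shows how little machinery is needed:

```python
# A minimal sketch of the relationship logic. Facts are stored as
# plain mappings; nothing here needs AI, just bookkeeping.
relations = {}   # my direct relations, e.g. relations["sister"] = "Ivy"
family = {}      # relations between others, e.g. family[("Ivy", "son")] = "Oliver"

# Assumed reduction rules: a compound relation in terms of two stored ones.
DERIVED = {"nephew": ("sister", "son")}  # my nephew = my sister's son

def remember(relation, name):
    relations[relation] = name           # "Ivy is my sister"

def remember_of(person, relation, name):
    family[(person, relation)] = name    # "Oliver is Ivy's son"

def resolve(relation):
    """'Text my nephew' -> look up my sister, then her son."""
    if relation in relations:
        return relations[relation]
    first, second = DERIVED[relation]
    return family[(relations[first], second)]

remember("sister", "Ivy")
remember_of("Ivy", "son", "Oliver")
print(resolve("nephew"))  # -> Oliver
```

One table of kinship rules and two lookups: that’s the entire “rocket science” the web search is standing in for.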
“What’s most infuriating about Siri is that, for a system designed to listen, it’s such a bad listener.”
What’s most infuriating about Siri is that, for a system designed to listen, it’s such a bad listener. And that’s more than just a bug. The unjustified insouciance, the flat refusal to contemplate its own ignorance, are inescapably symptomatic of Silicon Valley hubris.
Understanding human speech and the intentions behind it is an enormous challenge even for humans. As speakers, we stumble over our words; as listeners, we misinterpret them. Much of every conversation is spent gently requesting clarification or resolving ambiguity.
Yet Siri will have none of that. For all its conversational veneer, it’s just searching our input for the same barked commands we could find in a menu. Either it knows what we want, or it “doesn’t understand”, a passive-aggressive way of saying “Invalid input”. There’s nothing in between.
“That’s so Bay Area,” people say now, when some overprivileged tech mogul blunders into a complex human issue with an oversimplified solution to the wrong problem. We might as well say: “That’s so Siri.”
*Prove we’re wrong then