If Apple wanted to create a lively 3D avatar for Siri, it could, but it doesn’t for business reasons. Anything less than complete realism would make the avatar appear uncanny and creepy, turning off consumers. Apple could likely even make Siri’s voice more natural, given that her tone hasn’t changed much since her release in 2011. But again, Apple would have to improve her naturalism by leaps and bounds to cross the uncanny valley and make Siri acceptable to consumers.
However, as individual AI domains increase in sophistication, a new opportunity will arise that involves mashing up disparate AI modules. For example, if Siri had a 3D face floating on the screen of a self-driving Tesla, the effect would be like talking to KITT, the automated assistant featured in the 1980s television show Knight Rider. Instead of creeping us out, the combination would be delightful, tickling our curiosity. “Siri, take me to the grocery store.” “Okay, would you like to stop by the bank along the way?” If you added some eyelashes and fleshy features, you might start to feel like you had a real-life shopping companion.
The final 20% of AI sophistication for voice assistants will probably cost the same as the first 80%. But, as we get closer, it will become easier for entrepreneurs to round up the current best-performers in AI and produce a symphony of life emulation.
Benchmarking AI to Brain Modules
A naive way to measure AI progress is to look at the size of the computers involved. For example, Deep Blue, the computer that beat world chess champion Garry Kasparov in 1997, was about the size of a filing cabinet. Meanwhile, the chess-playing module inside our brain is maybe the size of a pea. The volume ratio between a pea and a filing cabinet is perhaps 1:50,000, which means that the AI for playing chess had roughly one fifty-thousandth the efficiency of the human brain, measured by volume. However, within a decade, cell phones had enough processing power to play world-class chess.
Likewise, Siri or Alexa today may take up an entire data center, which is about 1,000 filing cabinets, whereas our brain's language processing centers (Wernicke's area and Broca's area) are together around the size of a walnut. In a decade, it's possible that Siri or Alexa could exist inside a small module on our phones.
In the future, computer scientists will attempt to approximate the neocortex, which is the part of the brain associated with our high-level cognitive functions, including emotional intelligence. The neocortex has perhaps the volume of a tennis ball. It might take the whole world's processing power today to approximate one human's neocortex. But eventually, the hardware needed will shrink to the size of a data center, then a filing cabinet, and then ultimately a chip that could fit in the palm of our hands.
Each time we reach a milestone in artificial intelligence, it looks trivial in retrospect. For example, when Deep Blue beat chess champion Garry Kasparov, we wrote off chess as a rote puzzle. However, that doesn't mean we will be unimpressed by every AI accomplishment. We have to properly define the most meaningful components of intelligence so that we know which mountains we've actually climbed.
The highest mountain that AI can climb will be approximating the neocortex. The neocortex represents the largest increase in brain mass for humans relative to the rest of the animal kingdom. Since nature is efficient, the size of the neocortex is an indication of the difficulty of its function. The neocortex is responsible for our higher-order cognition, such as mind-reading or empathy. It's through the neocortex that language recognition becomes language understanding. In other words, the most interesting goalpost in AI will be social.
The current AI goalposts will seem trivial in retrospect. Multiple-object recognition and self-driving cars probably correspond to only a small part of our brains. Even an AI that passes the Turing Test won't be that impressive. After all, small talk is ultimately inane. But the ability to navigate political alliances won't seem trivial.
A social AI would need the ability to ask for and grant favors. It would have to know whom to avoid and whom to approach. It would have to detect liars, and it would have to learn how to trust. Empathizing with a group of friends and figuring out which restaurant to go to will require far more processing power than merely counting moves in chess.
We'll know we're making progress on Artificial General Intelligence when we start comparing AIs to 5-year-olds
Right now, artificial intelligence (AI) is about as smart as a two-year-old. This argument is based on two hard AI problems that we've solved. One is pathfinding. Thanks to DARPA, we can now throw a car into the desert and tell it to go from A to B, and it'll fumble its way there. Likewise, a two-year-old could waddle, however imperfectly, to get from one part of the house to another. The other solved problem is object identification. Thanks to MIT researchers, we can put a picture in front of a camera and an AI will tell you what the scene is about. Likewise, if you flipped a picture book in front of a two-year-old, they'd point and shout "apple!" or "fire truck!"
So it's 2017, and we have a two-year-old. Not bad. The question is, can we get to a 5-year-old? A 5-year-old doesn't just have two impressive skills, but maybe five. Not only can they get from A to B, but they can find a tool they haven't used before and start toying with it. Not only can they identify single objects, but they can describe a group of objects on a table. This task is far harder than single-object identification because a bowl of fruit is many things. Not only is it a bowl, but it's the individual fruits within it, as well as breakfast. Achieving multi-object identification is frequently called a Holy Grail for AI.
The final Holy Grail is artificial general intelligence (AGI). AGI is AI smart enough to make itself smarter, which, if achieved, would be the end of the human era on Earth. An AGI would have roughly the intelligence of a 10-year-old, since that's about the age when some children can begin coding. However, given the jump in the number of hard AI problems we would have to solve to go from a 2-year-old to a 5-year-old, we'd probably have to solve 30 hard AI problems to match the brain of a preadolescent. None of this is to say that it's impossible to build AGI, but given how long it's taken to solve just two hard AI problems, AGI is certainly not "around the corner."
We should be able to prove the materiality of consciousness soon thanks to AI
We'll find out soon enough whether Daniel Dennett was right that there is nothing special about consciousness. As we crawl up the artificial intelligence ladder, we should start to see basic forms of consciousness in our machines. We get a funny feeling of consciousness when we look at Deep Blue defeating Kasparov at chess. Or when we imagine the grid computing behind Siri, we feel that some intelligence is at work. But so far, our sense of a "ghost in the machine" isn't the same as what we feel watching a squirrel pause and scan its surroundings. We feel that there is some kind of consciousness in the squirrel, albeit primitive, even if it's not like ours. Or take an even simpler creature, a starfish. When we poke it and it curls up, we sense some kind of consciousness. We sense that it felt something, as if its reaction were its way of saying, "ouch." So if we're indeed making progress towards a generalized artificial intelligence, we should be able to poke at our machines and sense a similar reaction.