Technologies now help us do more and remember more than ever before. Of course, that isn't new. We have been doing this for years. How good are you at mental arithmetic (vs your calculator)? How many phone numbers can you actually remember? How many birthdays do you rely on Facebook remembering for you?
We continue to give agency - permission - to software run and owned by other organisations to act on our behalf, all in the pursuit of personal or professional efficiency and ease. There is an interesting link to the question of trust which I looked at in an earlier post, and a fascinating ethical dimension which is very well explored in this week's report from the RSA & DeepMind.
However, I want to come back to the important thoughts and underlying technologies raised in that report separately and focus here instead on the technologies in the "interaction layer" - and the remarkable visible strides being made with these systems of engagement to bring that agency even closer to our daily lives.
Avatars and visual "self"-representation
Avatars are already a decent-sized business – but with VR and AR, as well as flat-space interactions on websites, having a personalised presence complete with a visual identity – entity even – might enhance the online or connected experience.
Facebook understands enough of what you like and dislike already. It recognises (perhaps better via its other properties in Instagram and WhatsApp) that appearing responsive - responding promptly - is often as important as the response itself. Would you elect to let it respond for you – in a range of emotions? It is already investigating full-body responsive avatars that can do this (another step on the virtual reality journey they set out on when they bought Oculus).
Apple’s face-recognition technology, which can not only unlock your iPhone but also generate tailored Animoji, is an evolution of Snap’s masks and filters, which were themselves an iteration on Instagram’s filters.
The strides being made here are remarkable. Companies like Soul Machines (born out of New Zealand and Lord of The Rings) and Speech Graphics (born in Scotland and out of video games like Call of Duty and Gears of War) are creating extraordinary avatars that serve as customer-facing agents. These take not only speech, context and data into account as inputs, but also your expressions - they can use your phone or laptop's camera to monitor how you are reacting to them.
NatWest have teamed up with Soul Machines and IBM Watson to develop their "digital human" agent, Cora - with some pretty impressive results - and the technology is moving quickly.
Virtual Voice Assistants
Google recently blew the tech world away with its demo of Google Duplex, its remarkably natural-sounding speech engine that is capable of making calls on your behalf. Ask Google Assistant to book an appointment at the hairdressers, and thy will shall be done.
Check out the full Google blog post here for more detail and a few more sound clips. It's pretty extraordinary - especially the ability to deal with the ums and ahs, elaborations and repetitions that pepper everyday speech.
Furthermore, Google even have John Legend's voice coming as an option for Google Assistant! To extend this thinking, it will likely soon be possible (if they productise their WaveNet tech for consumers) for us each to record our own samples, and generate our own voices to use in Assistant. For a slightly more primitive but accessible version, try uploading a sample to Lyrebird to see how with just a minute's sample you can get a decent representation of your own!
Microsoft Xiaoice (which currently only works in Asia) operates slightly differently: rather than making full calls on a user's behalf, it maintains a back-and-forth conversation with a user.
As per this demo, Xiaoice's skills are focused on high emotional intelligence interactions, drawing on context, rather than on transactions. Microsoft's addition of Semantic Machines to their stable will enhance this capability further.
So the technology is closer than we might have appreciated - it's already uncannily getting past the uncanny valley of good but not quite right. But we have some quite fundamental questions to ask ourselves about where we consider the boundaries to be with this kind of technology. Cultural norms, privacy standards, security, and some far-reaching ethics questions can't be shirked - by any of us.
Thoughts as ever welcome :)