I know how you feel: how computer vision is changing everything

From improving cancer diagnosis to powering animated poo emojis, Jon Stubley tells us how his favourite computer vision applications are shaping the future.

The world has been aflutter with the launch of the iPhone 8 and iPhone X in the past week. As someone who works in the computer vision space, I find the fact that the (eye-wateringly expensive) iPhone X has facial recognition capabilities of particular note.

Not least because it will drive awareness of computer vision technology – the tech behind its Face ID and Animoji features (animated emojis that mirror your facial movements, including animals and, of course, the poo emoji, which talks, frowns and laughs).

Computer vision describes the ability of a digital device to take in visual data – photos, videos and even live action – and then actually make sense of it. It is already used for things you may have heard of, like photo library organisation, semi-autonomous cars and quality inspection on silicon chip assembly lines, as well as for powering the kind of in-image advertising we do.

However, it is also powering some innovative, exciting (and frankly weird) applications that you might not have come across yet.

Here are my current favourites:

 

I know how you feel

Disney has developed a neural network that can discern how audiences feel about a movie in 10 minutes or less, just by looking at pictures of their faces. With four infrared cameras trained on moviegoers’ faces in a 400-seat theatre, Disney researchers captured around 16 million facial expressions over the course of 150 screenings. Using a technique known as factorised variational auto-encoders (FVAE), Disney’s program assesses whether a person is, say, laughing or looking scared at the right places. The results can then be used for everything from test screenings to recommendation engines on Disney’s upcoming Netflix-like streaming video service.

In slightly Big Brother mode, US retail giant Walmart is developing facial recognition technology that analyses customer moods via cameras at checkout lines. If a shopper is annoyed because the cashier is too slow, the system will recognise the distress and instantly notify other workers to come and help.

This type of tech is also being applied in the animal kingdom. British researchers trained a neural network on 500 pictures of sheep in pain and sheep feeling just fine. Homing in on tell-tale ‘pain’ signs, such as forward-folding ears and squinting eyes, the system, which is based on the Sheep Pain Facial Expression Scale, learned to predict whether a sheep is hurting just by looking at a picture of its face.

 

Mapping the future

Back to the new iPhone launch: Apple’s developer tool ARKit (its alternative to Google’s Project Tango) will be released as part of iOS 11, so it won’t be confined to the new phones. It uses the iPhone’s camera to map out the physical environment and create 3D maps of the world around us.

In a nutshell, ARKit lets you drop virtual objects into an environment and move them around using the touchscreen. An obvious application is interior design: you can see what a chair would look like in a particular room from different angles.

 

Kicking cancer

Machines aren’t going to be replacing doctors, but they are helping them diagnose patients more effectively. Studies by PathAI, which builds and trains computer vision models to help pathologists diagnose better, found that accuracy rates on breast cancer lymph node biopsies went from 85% to 99.5% when comparing human-only to human-plus-computer approaches.

PathAI’s machine learning system essentially highlights areas where it sees cancer across the 500 or so slides under review. Speaking at the recent LDV Vision Summit, PathAI’s co-founder and CEO said: “It’s a dramatically faster procedure and now a pathologist’s job is really to correct or confirm the diagnosis provided by the AI system, then move on to the next case.”

 

Flipping out

In the US, burger chain CaliBurger is developing ‘Flippy’, a fast-food robot that uses computer vision and AI to analyse burgers and determine when they need to be flipped over on the grill. Flippy then turns them over and finishes cooking them, although real people will still be needed to assemble them. Due to roll out in 2018, it could mean the end of over-cooked burgers.

Staying in the world of food but bringing it back to media and marketing, Pinterest and MIT are working on technologies that simply look at a picture of a dish and automatically tell you how to cook it via recipe suggestions. Both startups and the big guys are working on in-fridge cameras that use image recognition to detect when food inside is about to expire (and send you notices about what you need to replace).

 

Lightning McQueen becomes real

This reference may go over your head if you don’t have kids and haven’t been forced to watch the wildly successful Disney Pixar Cars franchise a billion times (it’s about anthropomorphised cars competing in races). However, it is now a reality.

Startup Roborace and Formula E (competitive EV racing) are teaming up to develop self-driving race cars that can not only reach 199 miles per hour but also learn on the track (the more obstacle course-like, the better), making the record-breaking potential exponential, to say the least.

Meanwhile, autonomous boats are already being tested in Boston Harbour by startup Sea Machines, which is working on autonomous control systems for all kinds of vessels, from fire boats to cargo ships.

While some of these applications have more utility than others, it is clear that computer vision is a force for tremendous innovation. I’m looking forward to what comes next.

 

Jon Stubley is managing director, ANZ, at GumGum.

 

 

Image copyright: vadymvdrobot / 123RF Stock Photo

 

  • Bohyun Jung

    I’d suggest Google’s counterpart to Apple’s ARKit is ARCore, which was announced recently and is likely to be the successor to the Project Tango mentioned above.