Abhinai Srivastava is CEO and co-founder of Mashgin, a touchless self-checkout system powered by AI and computer vision.

Computer scientists have long dreamed of what artificial intelligence (AI) could help humankind achieve. Could we create machines that see and understand the world as we do? Computer vision is the subfield of AI dedicated to processing and understanding visual data.

Advances in AI, specifically deep learning and neural network innovations, have made computer vision nearly as good at recognizing objects and patterns as the human eye. The human visual system is complex: deciphering an image requires analyzing multi-dimensional data, which makes visual input far more demanding than the input most AI systems handle. Today, the applications of computer vision are as vast as Larry Roberts, the father of the field, predicted.

Nearly 60 years later, the future of computer vision appears full of promise, and its potential applications are vast, with some bordering on science fiction. At this moment, the technologies that power computer vision have finally begun to catch up to our dreams for its applications: self-driving cars, faster medical diagnoses, instantaneous checkout and more.

To create a real sense of what the future might hold, we need to first explore the evolution of computer vision and dive into its real-world applications today that are already improving people's daily lives. 

A Look At The Past

In 1959, the first digital image scanner transformed images into grids of numbers so that computers could process them. A few years later, work by Lawrence "Larry" Roberts, widely considered a father of both the internet and computer vision, explored the possibility of extracting 3-D geometric information from 2-D perspective views of blocks. Researchers soon recognized the need to identify images from the real world, and work began on low-level vision tasks such as segmentation and detection. Multiple frameworks emerged, including methods of capturing and recording objects and recognition-by-components, which proposed that the human visual system recognizes objects by breaking them down into their main parts. In 1980, Kunihiko Fukushima created the neocognitron, the precursor to the modern convolutional neural network (CNN). CNN architectures are now among the most widely used neural networks and helped drive the rise of deep learning.
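To make the CNN idea concrete: the core operation is a convolution, where a small filter slides over an image grid and produces strong responses wherever the pattern it encodes appears. Here is a minimal sketch in plain Python; the image and filter values are illustrative examples, not drawn from the article.

```python
# Minimal sketch of a CNN's core operation: sliding a small filter
# (kernel) over an image grid of numbers. Image and kernel values
# below are illustrative assumptions, not from any real system.

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A 4x4 "image" with a dark-to-light vertical boundary in the middle.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]

# A vertical-edge filter: responds strongly where brightness
# increases from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

result = convolve2d(image, kernel)
# result == [[30, 30], [30, 30]]: strong responses at the boundary.
```

In a real CNN, many such filters are learned from data rather than hand-designed, and their stacked outputs let the network recognize progressively larger components of an object, much as recognition-by-components suggested.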

In recent years, we've seen other pivotal computer vision moments, including Facebook's rollout of facial recognition in 2010, Google's launch of TensorFlow in 2015 and DeepMind's AlphaGo defeating world Go champion Lee Sedol in 2016.

Where Computer Vision Is Today

Computer vision has experienced a real boom. Today, it's the backbone of an autonomous future across many industry sectors, including transportation, healthcare, agriculture, retail and manufacturing. In May, Tesla announced it would retire radar and transition entirely to Tesla Vision, a camera-based autopilot system. In healthcare, Nanox is set to acquire Zebra Medical Vision, whose computer vision tools help diagnose bone, liver, lung and cardiovascular diseases. In retail, computer vision can provide insight into consumer behavior that retailers can leverage to design even better customer experiences.

Historically, computer vision couldn't reach the speed and accuracy necessary to operate in certain environments, including self-checkout in retail, which is what my company specializes in. Some have attempted to offload processing to the cloud, an approach that depends heavily on limited in-store internet connectivity. Recent advances in edge hardware, combined with the thoughtful application of AI, have made it possible to tackle this and other problems locally.

The Promise Of The Future

Computer vision is starting to change society as it becomes ubiquitous. Autonomous vehicles and many other industries rely on this technology to extend human capacity. By applying techniques like 3-D reconstruction, current-generation hardware can reach the necessary accuracy and speed at the edge.

Rather than replacing humans, the focus should be on increasing human cognitive bandwidth by letting technology analyze sensory input, particularly the low-value input that comes from mundane and repetitive tasks. Computer vision will reach its full potential once it transitions from research labs into the real world. The next technological revolution will not be as dramatic as the great leaps of the last two decades; it will be more nuanced and human-centered.

Today, often without us noticing, computer vision is already enhancing our lives. Face ID technology began by letting people unlock their phones; now it's been adopted by mobile applications, such as investment and banking apps, that require a high level of security. Computer vision is already essential to the development of the Fourth Industrial Revolution and will remain so as it continues to automate traditional manufacturing and industrial practices.

Present-day capabilities and applications only scratch the surface of this technology's nearly limitless potential. When it comes to embracing and implementing computer vision, the first step is identifying how it can serve people and make their lives easier, in both behavior and form. From there, we can start to make headway.


Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
