Computer vision (CV) is a transformative field that equips machines with the ability to interpret and understand visual data, mimicking human sight. By leveraging deep learning and convolutional neural networks (CNNs), computer vision enables systems to recognize, classify and analyze images and videos with high precision.
This technology has varied applications across industries, giving computers advanced human-like capabilities. From healthcare and semiautonomous vehicles to security systems and retail, computer vision is revolutionizing how machines interact with and interpret the visual world.
What is computer vision?
Computer vision definition: Computer vision is a branch of computer science that trains machines to interpret and understand visual data from images and videos.
The term “computer vision” reflects the goal of enabling computers to “see” and comprehend images and videos, with the capability to identify objects within them and categorize what they observe. This technology relies heavily on artificial intelligence to power these systems.
Computer vision technology is particularly powerful because it automates the previously manual process of visual information processing. While machines have been capable of processing visual data for some time, computer vision significantly reduces the need for the human intervention that older technologies required.
How does computer vision work?
Computer vision is driven by two primary technologies. The first is deep learning, a sophisticated form of machine learning that allows computer systems to learn from the data they are provided and the tasks they perform. The second is convolutional neural networks (CNNs), a type of deep learning model that learns features directly from the input data. CNNs are particularly effective for image and video analysis.
Computer vision systems receive visual data as input, which is then processed by algorithms trained to interpret the content much as a human would. CNNs break an image down into small regions and detect patterns within them, such as edges and textures, enabling the machine to understand images and videos.
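To make the idea of breaking an image into small regions more concrete, here is a minimal sketch of the sliding-window operation at the heart of a CNN layer. It is an illustration only: the 4x4 image, the `convolve2d` helper and the hand-picked `vertical_edge` kernel are assumptions made for this example, and a real CNN would learn its kernel weights from data rather than have them written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so this is
    technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the small image region under the kernel.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale image: dark left half (0), bright right half (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row is [0, 18, 0]: strong response only at the edge
```

The feature map is zero wherever the region under the kernel is uniform and large where the brightness changes, which is how a single filter highlights one kind of pattern.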
Training typically starts with labeled images, in which labels and tags are applied to the whole image, to individual elements or even to individual pixels. The system learns from these labeled examples and then uses CNNs to make predictions and decisions about new visual data, which are provided as output.
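The flow from labeled images to a prediction can be sketched in a few lines. Note that this example deliberately swaps in a much simpler technique, a nearest-centroid classifier, in place of a CNN so that it stays short and runnable with no dependencies. The tiny 2x2 "images", the label names and the helper functions are all hypothetical.

```python
from collections import defaultdict


def train_centroids(labeled_images):
    """Average the pixel vectors of each label to get one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for pixels, label in labeled_images:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
        for i, value in enumerate(pixels):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, pixels):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pixels))
    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical labeled data: flattened 2x2 grayscale images (0 = dark, 1 = bright).
training_set = [
    ([1.0, 1.0, 0.0, 0.0], "top_bright"),
    ([0.9, 1.0, 0.1, 0.0], "top_bright"),
    ([0.0, 0.0, 1.0, 1.0], "bottom_bright"),
    ([0.1, 0.0, 0.9, 1.0], "bottom_bright"),
]

centroids = train_centroids(training_set)

# An unseen image that is mostly bright on top.
prediction = predict(centroids, [0.8, 0.9, 0.2, 0.1])
print(prediction)  # top_bright
```

A CNN follows the same outline (learn from labeled examples, then predict on new input), but it learns layered visual features instead of a single average per label.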
What is the history of computer vision?
- 1960s, early experiments in computer vision: In the early 1960s, computer image scanning technology was launched and developed. With the increased capability of computers to process, digitize and transform images, along with the advent of artificial intelligence, computers began learning to understand images.
- 1970s, development of computer vision algorithms: By the next decade, computers could process images, leading to the development of algorithms that enabled them to visually understand the images they received. Initially these algorithms allowed computers to identify edges, but they rapidly evolved to identify objects and eventually add context. Computer vision algorithms were continually advancing.
- 1980s and 1990s, further evolution: Algorithms became more sophisticated, gaining specific capabilities, such as detecting curves, patterns and shapes. Considerable progress was made in 3D imagery, and by the late 1990s, computer vision was capable of image-based rendering, image morphing and more complex processes.
- 2000-2010, rise of real-time applications and standardization: The early 2000s saw significant advancements in real-time computer vision applications. The Viola-Jones face detection framework, introduced in 2001, enabled real-time face detection. This period also marked the standardization of visual datasets, such as the introduction of the ImageNet dataset in 2009, which became a crucial resource for training and benchmarking computer vision algorithms.
- 2010-current, deep learning revolution and beyond: The 2010s brought a revolution in computer vision with the advent of deep learning techniques. In 2012, the AlexNet model, which used CNNs, won the ImageNet Large Scale Visual Recognition Challenge, significantly outperforming previous methods. This era also saw the development of powerful frameworks like TensorFlow (2015) and PyTorch, which facilitated the implementation of complex neural networks. Recent advancements include the use of computer vision in semiautonomous vehicles, medical imaging and augmented reality.
In recent years, computer vision has advanced to the point where it can recognize and categorize elements within images with a very low error rate. This progress has been driven by improvements in deep learning and CNNs, enabling more accurate and efficient visual data processing.
What are key types of computer vision?
Computer vision enhances the ability of machines to replicate human behaviors. Similar to natural language processing, computer vision exemplifies how artificial intelligence and deep learning extend computers’ capabilities. Given the significant advancements enabled by this technology, the numerous and versatile applications for computer vision are no surprise.
- Image classification uses computer vision technology to categorize images. This use is helpful for technologies that rely on visual data, like social media platforms and applications. By defining and classifying image content, image classification ensures the right content is served to users.
- Facial recognition relies on computer vision models trained to recognize and identify certain patterns in images. With more complex and sophisticated versions of this technology, computers can distinguish individual faces from one another, which is especially useful for unlocking devices.
- Object detection builds on the principles of image classification. While image classification enables computers to identify and categorize images, object detection technology allows them to pinpoint specific objects within images and videos. This technology is crucial for safety in transportation as it can identify blockers (objects that completely obstruct the path and require stopping), obstacles (objects that partially obstruct the path and require navigation around them) and intruders who may pose a danger.
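The distinction between blockers and obstacles described above can be sketched as a simple post-processing step on detector output. This is a hypothetical illustration, not how any particular vehicle system works: it assumes a detector has already produced labeled boxes, reduces each box and the travel path to a horizontal interval `(x_min, x_max)`, and uses an invented `classify_obstruction` helper.

```python
def classify_obstruction(box, path):
    """Classify a detected box against the travel path by horizontal overlap.

    `box` and `path` are (x_min, x_max) intervals in the same units.
    Returns "blocker" if the box spans the whole path, "obstacle" if it
    covers only part of it, and "clear" if it does not touch the path.
    """
    box_min, box_max = box
    path_min, path_max = path
    overlap = min(box_max, path_max) - max(box_min, path_min)
    if overlap <= 0:
        return "clear"
    if box_min <= path_min and box_max >= path_max:
        return "blocker"
    return "obstacle"


# Hypothetical detector output: one label and one horizontal extent per object.
detections = [
    {"label": "fallen_tree", "box": (0.0, 10.0)},
    {"label": "traffic_cone", "box": (6.0, 7.0)},
    {"label": "parked_car", "box": (11.0, 13.0)},
]
path = (2.0, 8.0)

results = {d["label"]: classify_obstruction(d["box"], path) for d in detections}
print(results)
# {'fallen_tree': 'blocker', 'traffic_cone': 'obstacle', 'parked_car': 'clear'}
```

In practice the boxes would come from an object detection model and the geometry would be far richer, but the sketch shows why locating an object, not just naming it, matters for deciding whether to stop or steer around it.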
How is computer vision used?
One of the most straightforward and recognizable applications of computer vision technology is image searching. While using images to search the internet with search engines is not a recent technology, it has undergone significant innovations over the last few years. Initially, machines could only recognize established images or basic elements, but search engines are now equipped with computer vision capabilities that can adaptably search based on intricate details of complex elements.
A key use case for computer vision is in security across various sectors and industries. For everyday security, computer vision can recognize unsafe objects and contraband. By expediting and automating safety and security processes, we can keep people safer, with computer vision ensuring that public places are free of dangerous items.
Similarly, computer vision can be applied to the medical industry. By using machine learning to identify symptoms and factors that may be missed by doctors, computer vision can bolster the diagnosis process. This technology can improve accuracy and efficiency in both diagnosis and treatment.
Micron has been leveraging computer vision technology as a crucial element of its AI-enabled manufacturing for years, significantly enhancing yield, quality and time to market. By employing smart sight (image and video analytics) at every stage of wafer manufacturing, Micron can detect microscopic flaws or defects that are invisible to the human eye.
This technology allows for immediate error detection, enabling engineers to intercept and correct defective wafers early on and preventing the production of additional flawed products. In short, the integration of AI and computer vision has improved accuracy and efficiency, allowing for faster product launches and increased productivity.
Another innovative use of computer vision that is not yet fully integrated into society is in self-driving vehicles. Autonomous vehicles rely on technology to identify hazards and obstacles on the road, and computer vision is crucial for this process. By enhancing the safety of these vehicles, computer vision can advance product development and drive greater adoption of autonomous driving.
Computer vision and deep learning are closely related fields, but they serve different purposes and have distinct characteristics. Computer vision is a field of AI that focuses on enabling computers to interpret and understand visual data from the world, such as images and videos.
Deep learning is a subset of machine learning that uses neural networks with many layers (hence “deep”) to model complex patterns in data. It is a technique used within computer vision, and in other fields, to improve performance.
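As a rough illustration of what "many layers" means, the sketch below passes an input through three stacked layers, each feeding its output to the next. The weights are arbitrary numbers chosen for the example, not trained values, and the helper names (`dense`, `relu`, `forward`) are assumptions. A real vision model would use many more layers, including convolutional ones, with weights learned from data.

```python
def relu(values):
    """Nonlinearity applied between layers: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Run the input through every layer in order, with ReLU between layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # no ReLU after the final layer
            activations = relu(activations)
    return activations


# Arbitrary, untrained weights for a three-layer network (2 -> 2 -> 2 -> 1).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.5]),
    ([[2.0, -1.0]], [0.1]),
]

output = forward([2.0, 1.0], layers)
print(output)
```

Each layer transforms the previous layer's output, so later layers can represent more complex patterns than earlier ones. That stacking is the sense in which the network is "deep".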