Here’s how deep learning helps computers detect objects

Deep neural networks have gained fame for their ability to process visual information. And in the past few years, they have become a key component of many computer vision applications.

Among the key problems neural networks can solve is detecting and localizing objects in images. Object detection is used in many different domains, including autonomous driving, video surveillance, and healthcare.

In this post, I will briefly review the deep learning architectures that help computers detect objects.

Convolutional neural networks

One of the key components of most deep learning–based computer vision applications is the convolutional neural network (CNN). Invented in the 1980s by deep learning pioneer Yann LeCun, CNNs are a type of neural network that is efficient at capturing patterns in multidimensional spaces. This makes CNNs especially good for images, though they are used to process other types of data too. (To focus on visual data, we’ll consider our convolutional neural networks to be two-dimensional in this article.)

Every convolutional neural network is composed of one or more convolutional layers, software components that extract meaningful values from the input image. Each convolutional layer, in turn, is composed of several filters: square matrices that slide across the image and compute the weighted sum of pixel values at each location. Every filter has different weights and therefore extracts different features from the input image. The output of a convolutional layer is a set of “feature maps.”
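
As a rough sketch, here is what a single convolutional layer looks like in PyTorch. The framework choice, the 3×3 filter size, and the number of filters are illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

# A convolutional layer with 16 filters, each a 3x3 matrix that slides
# across the image. The input has 3 channels (an RGB image).
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

# A random stand-in for a 224x224 RGB image (batch of 1).
image = torch.randn(1, 3, 224, 224)

# Each filter produces one feature map, so the output has 16 channels.
feature_maps = conv(image)
print(feature_maps.shape)  # torch.Size([1, 16, 224, 224])
```

Each of the 16 channels in the output is one feature map, produced by one filter sliding across the whole image.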

When stacked on top of each other, convolutional layers can detect a hierarchy of visual patterns. For instance, the lower layers will produce feature maps for vertical and horizontal edges, corners, and other simple patterns. The next layers can detect more complex patterns such as grids and circles. As you move deeper into the network, the layers will detect complicated objects such as cars, houses, trees, and people.
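
To make the idea of stacking concrete, here is a minimal sketch of a few convolutional layers chained together, again in PyTorch. The layer count, channel sizes, and pooling steps are illustrative choices, not a prescribed architecture:

```python
import torch
import torch.nn as nn

layers = nn.Sequential(
    # Early layers: small receptive field, respond to edges and corners.
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),  # 224 -> 112

    # Middle layers: combine simple edges into more complex patterns.
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),  # 112 -> 56

    # Deeper layers: larger effective receptive field, so they can
    # respond to object-level structures.
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
)

image = torch.randn(1, 3, 224, 224)
print(layers(image).shape)  # torch.Size([1, 64, 56, 56])
```

Note how the spatial resolution shrinks while the number of feature maps grows: deeper layers see a larger portion of the original image at once, which is what lets them respond to whole objects rather than isolated edges.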