  • Xuehao Liu

Background knowledge of CNN

Updated: Jan 31, 2019

A Convolutional Neural Network (CNN) is a powerful tool for processing images.

First, it is accurate. This page lists the accuracies reached on classification tasks for small datasets. You can see that the error rate on the MNIST dataset is down to 0.21%! In the most famous image classification challenge, ImageNet, the recent error rate is 3.1%, and that is a 1000-class problem. This error rate is lower than a human's: on this benchmark, CNNs perform better than people.

Second, it is fast. You can do classification in real time. An ordinary laptop can process images at about 5 fps. Once the network is trained, you can do amazing work at high speed using just the GPU in your laptop. Right now in China, high-speed rail stations use live face recognition to check every person who is trying to enter the station.

 

It is fast and accurate. So how does it work?

Before we start, this is a brilliant website visualizing how CNNs work.

The idea of a CNN is based on "filters". A filter is a small matrix that works like a calculator. When the filter is slid across an image and multiplied with each patch of pixels (a convolution), certain parts of the image are activated. Put that way it sounds like nonsense, and no one can understand it. Let's do some experiments (all of these use Python):
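Here is a minimal sketch of one such experiment, assuming only NumPy. The toy 6x6 image, the helper name `apply_filter`, and the hand-written Sobel-like filter are my own illustrative choices, not something a trained network produced. It shows how a single filter "activates" only where its pattern (here, a dark-to-bright vertical edge) appears.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1.

    This is the cross-correlation that CNN layers usually call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch by the filter element-wise and sum the result
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image (assumption): left half dark (0), right half bright (1)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges
vertical_edge = np.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype=float)

activation = apply_filter(image, vertical_edge)
print(activation)
# Every row of the 4x4 result is [0, 4, 4, 0]:
# the filter fires only around the boundary between the two halves.
```

If you swap in `vertical_edge.T` (a horizontal-edge filter), the output is all zeros, because this image has no horizontal edge for that filter to find.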

So this is just a small step forward. But what it means is that we can pick the right filter to recognize the right pattern. Once certain areas of the image are activated, we can let the machine "recognize" objects. It will look like this:

Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.

You can see this is how CNNs "understand" and "recognize" the shapes in images. These are actually the features extracted by the higher layers of a CNN. A lower layer, which is one closer to the raw image, only extracts low-level features such as lines and corners. A higher layer, which is one closer to the actual output of the CNN, may be able to extract a more abstract (higher-level) feature, such as the head of a dog or the face of a man, as you see above.
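To make the lower-layer versus higher-layer idea concrete, here is a small sketch with two stacked layers, again assuming only NumPy. Everything in it is hand-picked for illustration: the 8x8 toy image, the two edge filters, and the second-layer weights and bias. A real CNN learns these values from data. The first layer finds simple edges, and the second layer combines the two edge maps into a slightly more abstract feature, a corner.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no padding), stride-1 cross-correlation of a 2D image and kernel."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive activations and set the rest to zero."""
    return np.maximum(x, 0.0)

# Toy image (assumption): a bright square in the bottom-right corner
image = np.zeros((8, 8))
image[4:, 4:] = 1.0

# Layer 1: two hand-picked low-level filters (edges)
vertical_edge = np.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype=float)
horizontal_edge = vertical_edge.T

v = relu(conv2d(image, vertical_edge))    # fires on the square's left edge
h = relu(conv2d(image, horizontal_edge))  # fires on the square's top edge

# Layer 2: a 1x1 convolution over the two feature maps.
# Weights (1, 1) and bias -5 are illustrative; the unit fires only
# where BOTH edge features are strong, i.e. at the corner.
corner = relu(1.0 * v + 1.0 * h - 5.0)

print(np.argwhere(corner > 0))  # [[3 3]]
```

The single active location, (3, 3) in the 6x6 feature map, covers the image patch centered on pixel (4, 4), which is exactly the top-left corner of the bright square. Neither edge map alone can tell a corner from a plain edge, but the layer on top of them can.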


With the features extracted by a CNN, we can do some amazing things by manipulating them.








