History of Computer Vision - Part 1

[It has been a long time since I last wrote a dedicated technical post about computer vision.]


Computer Vision is an interdisciplinary research area that draws on many fields, such as physics, biology, computer science, and neuroscience. For this reason, people take different approaches when they consider the history of computer vision. In this post, I summarize the viewpoints of two researchers, Prof. Fei-Fei Li and Prof. Richard Szeliski, from their courses [1] and textbooks [1]. In its early period, computer vision was considered a subfield of Artificial Intelligence. In the 1950s, the major goal of Artificial Intelligence was to create a framework able to simulate the mechanisms of the human brain. Likewise, most research problems in Computer Vision at that time aimed to find a system able to understand and manipulate visual inputs as the human cognitive system does.

Why and how can a visual system precisely observe and interpret a scene received through the eyes? This question requires us to study the mechanism of visual systems in organisms and how those systems developed throughout evolution. While scientists were trying to find the answer, they made prominent discoveries in many fields.

543M BC

The Cambrian explosion, source: Papermasters

This is the Big Bang of evolution: prior to this time, few organisms existed on the planet, and then suddenly a huge number of living beings appeared on Earth. Based on fossil evidence, several hypotheses have been proposed to explain the phenomenon. Perhaps a meteor brought those living beings, or perhaps there were tremendous changes in climate and environment that transformed all of biology on the planet. [These assumptions are extensively discussed in Cosmos, Carl Sagan].

Biologists call this phenomenon the Cambrian explosion, a short period in which most major animal groups appeared in the fossil record. One hypothesis explains it by the emergence of visual systems in animals. For the first time in history, organisms were no longer at the mercy of their surroundings, since visual systems helped them interact flexibly with their environments as well as avoid predators. Thanks to these systems, animals were able to survive, navigate, and manipulate. These activities were later considered the most important problems for artificial visual systems, i.e., to create a system that can survive, navigate, and manipulate its surroundings (see footnote 1).

16th Century

Camera obscura and a painter. Source: unknown.

Throughout history, mankind has progressively invented devices able to capture a scene. There were many such inventions around the world, e.g., in China, Egypt, and other ancient civilizations. However, it was not until the 16th century that Leonardo da Vinci left the first rigorous records discussing in detail how these devices work. Later, these mysterious devices came to be called “cameras”.

Leonardo da Vinci's notes on the human eye.

In his documents, Leonardo used the pinhole principle to model the mechanism by which a camera captures a scene. Such a device is also referred to as a camera obscura. In addition, he applied this principle to understand how human eyes function. [This book describes extensively Leonardo’s incredible works.]
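To make the pinhole principle concrete, here is a minimal sketch of the idealized model as it is usually written today. It is not taken from Leonardo's notes: the function name `project_pinhole`, the `focal_length` parameter, and the coordinate convention (pinhole at the origin, Z axis pointing toward the scene) are assumptions made for illustration.

```python
def project_pinhole(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an ideal pinhole camera.

    Assumed convention (illustrative only): the pinhole sits at the origin,
    the scene lies at Z > 0, and the image plane is `focal_length` behind the hole.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("the point must lie in front of the pinhole (Z > 0)")
    # Similar triangles give the image coordinates; the minus signs express
    # that the image formed behind the pinhole is upside down and mirrored.
    return (-focal_length * x / z, -focal_length * y / z)


# A point twice as far away produces an image half as large.
near = project_pinhole((2.0, 1.0, 4.0), focal_length=2.0)
far = project_pinhole((2.0, 1.0, 8.0), focal_length=2.0)
```

The inverted image is the same effect a painter sees on the wall of a camera obscura.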

In this period, many kinds of cameras were invented. Nowadays, cameras have become one of the most popular devices around the world. As of 2016, the number of visual sensors exceeded the world population. However, these inventions, as well as Leonardo’s discoveries, only copy the visual environment; cameras still could neither understand nor interpret what they received.

In the next couple of posts, I will discuss the progress of computer vision in the modern era, especially 1950-1970, when Artificial Intelligence was a very active topic in the research community.

  1. It is quite similar to another area of machine learning, Reinforcement Learning, which I will talk about in another post.