Object detection involves identifying the presence and location of objects in an image or video. It is accomplished using machine learning algorithms that are trained on a large dataset of labeled images.
Run the code
Once you’ve SSH’d into your car, type the following commands in your terminal (pressing ‘enter’ after each one):
cd vilib
cd examples
sudo python3 objects_detection.py
In your terminal, you will see a link after “Running on:”. Copy and paste that link into your browser to see the camera feed. The example below uses http://192.168.1.94:9000/mjpg
If you can, keep both windows open (split-screen) so you see the terminal page on one side and the car’s camera feed on the other, like this:
There are several approaches to object detection, but a common one is called the “sliding window” approach. A sliding window is a fixed rectangular region that “slides” across an image. The algorithm applies an image classifier to each window to determine whether it contains an object of interest; in this case, a face.
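To make the sliding window idea concrete, here is a minimal sketch in plain Python. It is not the code that runs on the car: the function names (sliding_windows, detect, mostly_bright) are made up for illustration, the "image" is a tiny grid of brightness values, and the classifier is a toy rule standing in for a trained model.

```python
def sliding_windows(width, height, win_w, win_h, step):
    """Yield (x, y, win_w, win_h) for every window position,
    moving left to right, then top to bottom."""
    for y in range(0, height - win_h + 1, step):
        for x in range(0, width - win_w + 1, step):
            yield (x, y, win_w, win_h)


def detect(image, win_w, win_h, step, classify):
    """Apply classify() to each window crop and collect the boxes it accepts."""
    height = len(image)
    width = len(image[0])
    hits = []
    for x, y, w, h in sliding_windows(width, height, win_w, win_h, step):
        crop = [row[x:x + w] for row in image[y:y + h]]
        if classify(crop):
            hits.append((x, y, w, h))
    return hits


def mostly_bright(crop):
    """Toy stand-in for a trained classifier (an assumption for this sketch):
    accept the window if its average brightness is above 0.5."""
    total = sum(sum(row) for row in crop)
    count = len(crop) * len(crop[0])
    return total / count > 0.5


# A 6x6 grayscale "image" that is dark except for a bright 2x2 patch
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] = 1.0

boxes = detect(image, win_w=2, win_h=2, step=1, classify=mostly_bright)
print(boxes)  # [(2, 2, 2, 2)]
```

Only the window that sits exactly on the bright patch passes the test. A real detector would swap mostly_bright for a trained classifier and usually repeat the scan with several window sizes, which is part of why this approach can be slow.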
Another approach to object detection is to use a convolutional neural network (CNN). A CNN is composed of multiple layers of interconnected “neurons” that process and analyze the input data. Just as the brain consists of billions of neurons, a CNN has artificial neurons arranged in layers, with each layer passing its output to the next.
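As a rough illustration of what one convolutional layer computes, the sketch below slides a small grid of weights (a "kernel") over an image and sums the products at each position, then applies the common ReLU step that keeps only positive responses. The names conv2d and relu and the hand-picked kernel are assumptions for this example; in a real CNN the kernel values are learned during training, and libraries such as TensorFlow do this work far more efficiently.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, step of 1) and return
    the grid of weighted sums. Deep learning libraries call this operation
    convolution, although the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[y + i][x + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# A 4x4 image with a vertical edge: dark on the left, bright on the right
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

# Hand-picked kernel that responds to dark-to-bright transitions
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]

features = relu(conv2d(image, kernel))
print(features)  # [[3.0, 3.0], [3.0, 3.0]]
```

Every output value is positive because each 3x3 window contains the dark-to-bright edge. Flipping the image so the bright side is on the left would give negative sums, which ReLU turns into zeros. A CNN stacks many such layers, each with many kernels, so that later layers respond to more complex shapes.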
This method is also used to train AI to learn and play your favorite games. See for yourself with this Tetris AI Bot. Notice how the computer will not know what to do at first. Once you press “LOAD” at the bottom right to load the data set, the computer will play Tetris with ease.
So how do we get a trained model like this? TensorFlow is a software library for machine learning that can be used to design, train, and deploy CNNs. TensorFlow provides a number of pre-trained CNN models that can be fine-tuned for specific tasks, such as image classification or object detection.
The algorithm on your Raspberry Pi uses this library for object detection. Take a look!