4 – The software side
Up to now we’ve focused mainly on the hardware side of the system, and even though many embedded programs don’t need a computer to run, this system is slightly different. For want of a better analogy, let’s compare the system to a car. The hardware would be the wheels, the brakes and the engine, but without a nice cosy cab, a steering wheel and an instrument panel, we wouldn’t get far. So let’s talk about that.
The pan-tilt sub-system houses the webcam that picks up the images and feeds them to the computer for analysis. The computer then tells the hardware system to move the servos, and the loop is repeated. The Arduino communicates with the computer using serial communication over USB. This communication concept is one that I found pretty easy to understand but complex to implement properly, so I think it warrants its own blog entry.
In this project, Processing is used to create a very simple control program. I have to warn you that there are no frills to the program we’ll be describing next. It is simply a screen that displays the video feed from the webcam.
Before I go into the code and what every bit does, we need to look at what is needed to run the Processing program:
– OpenCV for Processing
OpenCV is a set of computer vision libraries developed by a large community and available to everyone, free of charge. When I explained to my mentor at university what I wanted to build at the beginning of the year, he told me that developing the vision algorithms required to distinguish a selected object from the rest of the video frame took many people many years. That is the reason for drawing on the existing wealth of information and example code available on the net on this subject.
The code below is the exact code that I used in my program:
First, we need to import the correct libraries:
import gab.opencv.*;
import processing.video.*;
import java.awt.*;
import processing.serial.*;
Capture video; //initialise video capture
OpenCV opencv; //Initialise OpenCV
Serial myPort; // Create object from Serial class
//Variables used in object detection
int z=0, k=0, count =0;
int rectMidx, rectMidy, x=90, y=90, x1, y1, xin, yin, xdiff, ydiff;
//Variables used in serial communication
boolean readyToReceive = false, readSerial = false, dataReceived = false;
boolean received = false;
boolean manual = false;
byte detect=0x0, b;
String mystring = "";
//The Setup function runs once
void setup()
{
int wide, high;
byte c[] = {'0', '0'};
String[] cameras = Capture.list(); //create a string array to store the available camera addresses in
wide = 1280; //middle of the screen is 640
high = 960; //middle of the screen is 480
size(wide, high); //Declare the size of the video screen
video = new Capture(this, wide/2, high/2); //start the video capture
opencv = new OpenCV(this, wide/2, high/2); //start opencv
/*load the relevant cascade file – in this case face tracking. A different cascade
file can be used to track other objects.*/
opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
video.start();
String portName = Serial.list()[0]; //select the first available serial port
myPort = new Serial(this, portName, 57600); //set serial communication parameters
/*The following snippet is used as a handshake function with the Arduino board*/
while (myPort.available() <= 0) //wait until the Arduino has sent data
{
delay(10);
}
if (myPort.available() > 0) //data has arrived
{
//Arduino will be sending a 'T' followed by '\n' every 300ms
myPort.readBytesUntil('\n', c);
if (c[0] == 'T')
{
println();
println("Arduino ready to receive");
println();
myPort.write('A'); //write an 'A' back to the Arduino
myPort.clear(); //clear the port of all data
readyToReceive = true;
}
}
}
/*The Draw function loops continually*/
void draw()
{
int i=0, j;
int screenMidx = 320, screenMidy=240; //the middle position of the screen is captured in variables
byte c[] = {
' ', ' ', ' ', ' ', ' '
}; //a byte array is created to store serial data
boolean change = false, closeToLimit = false;
boolean faceDetected = false;
/*If the object midpoint is within this threshold of the screen mid point
no movement data will be sent to Arduino – aids stability*/
int moveThreshold = 15;
int borderThreshold = 100;
scale(2); //middle of the screen is now 320, 240 due to the scaling of the screen
opencv.loadImage(video);
image(video, 0, 0 );
/*These parameters define certain attributes for the rectangle that is drawn around a detected object*/
noFill();
stroke(0, 255, 0);
strokeWeight(3);
Rectangle[] faces = opencv.detect();
Rectangle bigFace = new Rectangle();
if (faces.length >1)
{
for (i=0; i<faces.length;i++)
{
if ((faces[i].height*faces[i].width) > (bigFace.height*bigFace.width))
{
bigFace = faces[i];
faceDetected = true;
}
}
}
else if (faces.length >0)
{
bigFace = faces[0];
faceDetected = true;
}
else
{
detect = 0x0;
faceDetected = false;
}
if (faceDetected) //if a face has been detected
{
detect =0x01;
rect(bigFace.x, bigFace.y, bigFace.width, bigFace.height); //draw the rectangle around the face
rectMidx = bigFace.x + round(bigFace.width/2); //determine rectangle midpoint
rectMidy = bigFace.y + round(bigFace.height/2); //determine rectangle midpoint
xdiff = 0 - round((rectMidx - screenMidx)*0.2813); //0.2813: calc how far the rect has moved in x
ydiff = round((rectMidy - screenMidy)*0.375); //0.375: calc how far the rect has moved in y
//println("xdiff:", xdiff);
if (bigFace.x <= borderThreshold || (bigFace.x+bigFace.width + borderThreshold)>=(screenMidx*2))
{
closeToLimit = true;
//println("closeToLimit is TRUE for x");
}
if (bigFace.y <= borderThreshold || (bigFace.y+bigFace.height+borderThreshold) >=(screenMidy*2))
{
closeToLimit = true;
//println("closeToLimit is TRUE for y");
}
if (manual ==false)
{
/*The if statements allow the camera to move gradually in order to centre the object on the screen
without polling the program. Polling is not ideal in a program utilising serial communication.*/
if (ydiff >moveThreshold)
{
if (closeToLimit == false)
{
y = y-1; //y = new y location of rect mid
}
else
{
y = y-2;
}
}
if (ydiff <-moveThreshold)
{
if (closeToLimit ==false)
{
y = y+1; //y = new y location of rect mid
}
else
{
y = y+2;
}
}
if (xdiff >moveThreshold)
{
if (closeToLimit == false)
{
x = x-1; //x = new x location of rect mid
}
else
{
x = x-2;
}
}
if (xdiff <-moveThreshold)
{
if (closeToLimit ==false)
{
x = x+1; //x = new x location of rect mid
}
else
{
x = x+2;
}
}
}
LimitCheck();
k=k+1;
}
println("x:", x, "\ty:", y, "\tfaces:", faces.length, "\tR2R:", readyToReceive);
if (x!=x1 || y!=y1) //check if x and y values have changed
{
change = true;
//println("change = true");
}
else
{
change = false;
}
if (myPort.available()>0) //check whether serial data is available
{ //receive data
//println("Port is open for reading");
myPort.readBytesUntil('\n', c);
myPort.clear();
//println(c);
if (c[0]=='T')
{
manual = false;
//println("'T' sent by Arduino - readyToReceive = true");
readyToReceive = true;
c[0]=c[1]=c[2]=c[3]=c[4]=' ';
if (detect >0x0) //if a face has been detected
{
myPort.clear();
myPort.write('A'); //Send the data bytes, starting the stream with 'A'
myPort.write(x);
myPort.write(y);
myPort.write('\n');
//myPort.write("");
//myPort.clear();
readyToReceive = false;
println("Data sent:", 'A', x, y);
}
}
if (c[0]=='M')
{
println("Manual mode activated");
manual = true;
if (c[1]<0)
{
x1 = x;
x = 256 + c[1];
count = 1;
}
else
{
x1 = x;
x = c[1];
count = 1;
}
if (c[2]<0)
{
y1 = y;
y = 256 + c[2];
count = 1;
}
else
{
y1 = y;
y = c[2];
count = 1;
}
println("x: ", x, "y: ", y);
c[0]=c[1]=c[2]=c[3]=' ';
myPort.write('T');
myPort.write('\n');
}
}
/*Copy coordinates into old value variables for comparison*/
// x1= x;
// y1 = y;
}
//This function causes the camera to feed a picture to the program
void captureEvent(Capture c)
{
c.read();
}
void LimitCheck()
{/*LimitCheck ensures that no values outside the limits of the servos are sent to the Arduino.
The current limits have been tested and are correct. The smaller limits in the y direction are for practical reasons.*/
int xUpLimit = 179;
int xLowLimit = 3;
int yUpLimit = 149;
int yLowLimit = 31;
if (x>=xUpLimit)
{
x = xUpLimit;
}
if (x<xLowLimit)
{
x=xLowLimit;
}
if (y>yUpLimit)
{
y = yUpLimit;
}
if (y<yLowLimit)
{
y = yLowLimit;
}
}
void delay(int delay)
{/*A simple blocking delay function. Note that it is called in the
handshake loop in setup(), so it is in fact used by the program.*/
int time = millis();
while (millis() - time <= delay);
}
The code is commented, so I shall not explain every bit of code, but just certain principles that may seem a bit tricky.
if (faces.length >1) //here we want to determine which face is the biggest. We choose to track the biggest face, because that is naturally the closest face to the camera and we can only track one.
{
for (i=0; i<faces.length;i++)
{
if ((faces[i].height*faces[i].width) > (bigFace.height*bigFace.width))
{
bigFace = faces[i];
faceDetected = true;
}
}
}
else if (faces.length >0)
{
bigFace = faces[0];
faceDetected = true;
}
else
{
detect = 0x0;
faceDetected = false;
}
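The idea behind this selection can be illustrated in a few lines of Python. This is a sketch only, using a hypothetical `biggest_face` helper, and `faces` here is a plain list of (x, y, w, h) tuples rather than the OpenCV `Rectangle[]` array used above:

```python
# Pick the largest detection by area; an empty list means no face this frame.
def biggest_face(faces):
    if not faces:
        return None  # no detection: equivalent to faceDetected = false
    # max by area (w * h) mirrors the height*width comparison in the loop above
    return max(faces, key=lambda f: f[2] * f[3])

# The biggest (and therefore presumably closest) face wins:
print(biggest_face([(10, 10, 50, 60), (0, 0, 80, 90), (5, 5, 20, 20)]))  # → (0, 0, 80, 90)
```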
There are several ways to move the servos to track the face. Initially I simply fed the x and y coordinates directly to the Arduino, but this made the system very jumpy and it hardly tracked movement at all. The approach was then adapted. We have the screen/camera’s mid point, and we can calculate the mid point of the rectangle drawn around the detected face. If those two points are not in exactly the same place, we calculate the number of pixels between them. We then adjust the x and y values by 1 per loop on the relevant axis, so that the servo moves slowly until the mid points match. That is the principle, anyway.
In reality, if the servo angle changes by one degree while someone stands 2 metres away, the location of the face in the frame changes by many pixels. The result is an oscillating motion in the pan-tilt system. To avoid this, a threshold is used: the servo is only moved if the rectangle around the face moves more than 15 pixels.
On the other hand, if someone is running, they’ll simply run out of the frame and the system will be too slow to track them. So if the rectangle comes close to the edge of the frame, we increase the step size to 2 per loop to keep up with the image moving out of frame. This can of course be adjusted to produce more or less movement in the pan-tilt system.
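Putting the three ideas together (the 15-pixel dead band, the 1-step nudge, and the 2-step boost near the frame edge), the per-axis update can be sketched in Python. This is a simplified model of the logic in draw(), with hypothetical names, not the Processing code itself:

```python
MOVE_THRESHOLD = 15   # pixels of error tolerated before moving at all
NORMAL_STEP = 1       # step per loop when tracking normally
BOOST_STEP = 2        # step per loop when the face nears the frame edge

def step_axis(servo_angle, diff, close_to_limit):
    """Return the new servo angle for one axis.

    diff is the (scaled) error between the face midpoint and the screen
    midpoint; close_to_limit is True when the face rectangle is within the
    border threshold of the frame edge.
    """
    if abs(diff) <= MOVE_THRESHOLD:
        return servo_angle  # inside the dead band: hold still, aids stability
    step = BOOST_STEP if close_to_limit else NORMAL_STEP
    # positive diff decreases the angle, matching the sign convention above
    return servo_angle - step if diff > 0 else servo_angle + step

print(step_axis(90, 10, False))   # → 90 (within dead band, no movement)
print(step_axis(90, 20, False))   # → 89 (normal 1-step nudge)
print(step_axis(90, 20, True))    # → 88 (2-step boost near the edge)
```

The same function serves both axes, which is why the Processing sketch repeats the same if/else structure for x and y.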