Working With Computer Vision: Game Player - Step 1

I seem to be spending a great amount of time writing blog posts and not a huge amount of time actually progressing with the program. I think this is because a lot of my time so far has been spent capturing screenshots for the setting up portion of the system, so now that that is over with, things should proceed a bit more briskly.

So we've got our image data in C and we also have the ability to generate a window. Let's combine the two so we can see our image from C; that way we can be certain that we do in fact have our image data and not just a jumble of nonsense.

To do this OpenCV needs to recreate the image from the data, since right now all we have is an array of ints. In order to recreate the image it needs to know the size of the image in terms of its width and height, so we will change our function to pass these in.

Code changes in Java:
Function's declaration:

static public native int[] calcMove(int[] pixelData, int width, int height);

Function's usage:

calcMove(pixelData, screen.getWidth(), screen.getHeight());

Code change in C header (gameplayer_GamePlayer.h):

JNIEXPORT jintArray JNICALL Java_gameplayer_GamePlayer_calcMove
  (JNIEnv *, jclass, jintArray, jint, jint);

Code change in C implementation (gameplayer_GamePlayer.cpp):

JNIEXPORT jintArray JNICALL Java_gameplayer_GamePlayer_calcMove
(JNIEnv *env, jclass obj, jintArray data, jint width, jint height) {

Now that we've passed in the sizes we can go ahead and create a Mat in OpenCV which is basically a large array used for storing image data. A lot of OpenCV's functions expect the image data to be in this structure so getting it in early is a good idea. You create a Mat like so:

Mat screen(height, width, CV_8UC4, elem);

You should be able to tell what each of these arguments is except perhaps the 'CV_8UC4'. You can read more about it in the OpenCV API but a quick explanation is we are using 8 unsigned bits per channel, hence the 8U, and we have 4 channels, hence the C4. This way C knows we need 8 * 4 bits per pixel, i.e. 32 bits as we talked about in a previous post!
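If the 8 * 4 arithmetic feels abstract, here is a minimal sketch in plain C++ (no OpenCV needed). The names bitsPerPixel, Bgra and unpack are made up for illustration, and it assumes each int arriving from Java is packed as 0xAARRGGBB, which is how BufferedImage.getRGB packs its pixels. If your capture code packs pixels differently the shifts would need to change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bits needed per pixel for a given channel layout.
int bitsPerPixel(int bitsPerChannel, int channels) {
    assert(bitsPerChannel > 0 && channels > 0);
    return bitsPerChannel * channels;
}

// One pixel split into its four 8-bit channels.
struct Bgra {
    std::uint8_t b, g, r, a;
};

// Assumption: the pixel is packed as 0xAARRGGBB, so blue sits in the
// lowest byte and alpha in the highest.
Bgra unpack(std::uint32_t argb) {
    Bgra p;
    p.b = static_cast<std::uint8_t>(argb & 0xFF);
    p.g = static_cast<std::uint8_t>((argb >> 8) & 0xFF);
    p.r = static_cast<std::uint8_t>((argb >> 16) & 0xFF);
    p.a = static_cast<std::uint8_t>((argb >> 24) & 0xFF);
    return p;
}
```

On a little-endian machine (most desktops) the lowest byte comes first in memory, so a pixel packed this way is laid out as B, G, R, A when read one byte at a time.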

We can then draw this image to the window we created earlier (although I've changed its name to 'Screen') by adding this code:

namedWindow("Screen", CV_WINDOW_AUTOSIZE);
imshow("Screen", screen);

Image recreated and displayed in C

Pretty neat, eh? Compile and run the program and you'll get the same output.

Of course we aren't really interested in the entire screen, just a portion of it, the game board, so we need to work out where this is. Now if we were always going to run our system on the same computer using the same display and in the same browser we could work out the co-ordinates of the game board on screen pretty easily using Java's Robot class and then hard code them in. However, I am keen to make this system as robust as possible and able to run even if the game board appears in a different location, so what we will do is work out the board's position on screen from within our project. This is going to be the first 'clever' part of our system and I hope you are as excited as I am to develop this part.

Looking at our image we can see that the game developer has actually been really kind to us and has circled our game board (ok, squared it) with a nice thick purple box. If we can detect where that purple box is on screen then we can tell where the game board is. And wouldn't you know OpenCV has a couple of functions that will let us do just that!

The game board with a nice, helpful purple border

First we want to know the actual colour of the purple box round the board as an RGB colour code. I know we know it is purple but sadly 'purple' isn't descriptive enough for the computer. We could do this with Java but there is a handy, free program called ColorPic which you can use to get the colour of any pixel on your desktop and if you find yourself doing this often I highly recommend getting it. So if you want to, download that program and use it to find out the value for the purple in the screen. If you don't want to download it then just accept my word that it is R:220 G:031 B:228 (You may get a different value depending on what pixel you click on but it should be high, low, high value.)

ColorPic from www.iconico.com

Now we can use that value in OpenCV's inRange method.

What this method does is take an array as input and check whether each of the values in the array falls within a specified lower and upper bound. If it does, the corresponding value in the output array is set to white; if not, it is set to black. What we want to do is pass in our image to this function and set our upper and lower limits around the purple colour we found so that everything else is filtered out of the image.

If you wanted you could set both the upper and lower limit to be the exact value we found but you'll find that in reality that means hardly any pixels get through as there is often a small amount of variance between pixels that appear to be the same colour to the eye. With systems like this that use drawn graphics it may not be such an issue but if you ever do any processing on images from cameras you'll notice the problem. To compensate for this I recommend expanding the limits up and down a bit to allow for a wider range of 'purples' to come through the filter. This will also work well since you can plainly see that the border isn't one solid shade of purple. You can expand them by more than I have if you want but if you expand too much you'll start letting through other colours which will confuse the system. Trial and error gets the balance right.
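One way to experiment with the limits is to start from the measured colour and widen it by a tolerance. The sketch below is plain C++ with hypothetical names (Channels, Bounds, widen) and is not part of the project's code. It assumes the channels are ordered B, G, R, A and that alpha should always be allowed its full range.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

using Channels = std::array<int, 4>;  // B, G, R, A

struct Bounds {
    Channels lower;
    Channels upper;
};

// Hypothetical helper: widen the colour channels by 'tolerance' in both
// directions, clamp to 0..255, and leave alpha wide open.
Bounds widen(const Channels& target, int tolerance) {
    assert(tolerance >= 0);
    Bounds out{};
    for (std::size_t c = 0; c < 3; ++c) {
        out.lower[c] = std::max(0, target[c] - tolerance);
        out.upper[c] = std::min(255, target[c] + tolerance);
    }
    out.lower[3] = 0;
    out.upper[3] = 255;
    return out;
}
```

With the purple measured above (B 228, G 31, R 220) and a tolerance of 40 this gives a lower limit of (188, 0, 180, 0) and an upper limit of (255, 71, 255, 255). Those are not the limits I ended up using, just a starting point for the trial and error.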

Right, let's have a look at the code for the inRange method. In the call I pass the image I am filtering, I define my limits, first the lower then the upper (notice I'm also accounting for the alpha channel since it is recorded in my array), and then I pass in a new Mat which I declared beforehand to store the results in. Here it is:

//Filter for purple (B,G,R,A)
Mat purpleOnly;
inRange(screen, Scalar(150, 0, 150, 0), Scalar(255, 50, 255, 255), purpleOnly);
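To make the filtering step less of a black box, here is a rough sketch of the same per-pixel test written in plain C++. It is not OpenCV's implementation, and Pixel, maskValue and filterRange are names I have made up for the example. A pixel passes only if every one of its channels lies between the lower and upper limit (inclusive).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::array<std::uint8_t, 4>;  // B, G, R, A

// Returns 255 (white) if every channel is inside [lower, upper], else 0 (black).
std::uint8_t maskValue(const Pixel& p, const Pixel& lower, const Pixel& upper) {
    for (std::size_t c = 0; c < p.size(); ++c) {
        if (p[c] < lower[c] || p[c] > upper[c]) {
            return 0;
        }
    }
    return 255;
}

// Builds a single-channel mask with one value per input pixel.
std::vector<std::uint8_t> filterRange(const std::vector<Pixel>& image,
                                      const Pixel& lower, const Pixel& upper) {
    for (std::size_t c = 0; c < lower.size(); ++c) {
        assert(lower[c] <= upper[c]);
    }
    std::vector<std::uint8_t> mask;
    mask.reserve(image.size());
    for (const Pixel& p : image) {
        mask.push_back(maskValue(p, lower, upper));
    }
    return mask;
}
```

Run with the limits above, our purple (228, 31, 220, 255) comes out white, while a pure white pixel comes out black because its green value of 255 is over the upper limit of 50.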

We can then show this image by changing the image displayed in our window:

namedWindow("Screen", CV_WINDOW_AUTOSIZE);
imshow("Screen", purpleOnly);

Compile the C code and return to the Java project because we want to make a slight change there before we run it again. You probably noticed earlier that we need time to run our program and then change the window to the game window so that the right screen is detected. Add in a little sleep to the start of the project of around 5 seconds so that you have time to run the project and then switch to the game window.

Thread.sleep(5000);

Even better at this stage is to just capture a screenshot of the game board (or use the one I posted above) and switch to that after running the project. This saves waiting for the game to load and also (if you actually ever play the game) saves you racking up the losses in your statistics if you don't go on to play the game afterwards.

If all has gone well you should be able to run the Java program and switch to the game board window with just a few seconds passing before you are presented with this:

The results of the purple filter

Superb, we really couldn't have asked for a better result (Ok it could have been a complete square if those blue 'gates' weren't in the game but this is still pretty damn good!) Don't worry if your output looks like this:

A completely black image. Don't worry, chances are there was just no purple on the screen

It probably means you didn't give yourself a big enough delay to switch windows. Increase the delay and try again. If it still appears black try increasing the limits on the values being passed to inRange to see if it detects anything. If it doesn't, make sure you are taking account of the alpha channel and setting it to 0 for lower limit and 255 for upper. If it still doesn't work then let me know in the comments. I can't guarantee I can help but maybe another reader can.

So we have the outline of the game board but we still don't really know where it is. I mean we can see it, but the computer doesn't know where it is. What we need is a point for where the top corner is or even better a box encapsulating the whole thing. Again OpenCV has functions that will do just that.

I'm going to get a box encapsulating the game board as this will be a lot more useful to us in the future. If we only wanted to find the top left most point of the image we could simply iterate through the image data array we created and look for the first non-zero value. Dividing the offset of that value by the width of the image gives us the Y point (the row) and the remainder of that division gives us the X point (the column).

The problem with this is it only works if we get a perfect outline in the first place with no 'noisy' pixels (pixels that the system has picked up that we didn't want it to).
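For the curious, the simple scan described above might look something like this in plain C++. PixelPos and firstWhite are hypothetical names, and the sketch assumes the mask is a single-channel image stored row by row with one byte per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PixelPos {
    int x;
    int y;
    bool found;
};

// Scans the mask row by row and reports the first non-zero pixel.
PixelPos firstWhite(const std::vector<std::uint8_t>& mask, int width) {
    assert(width > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    assert(mask.size() % w == 0);
    for (std::size_t offset = 0; offset < mask.size(); ++offset) {
        if (mask[offset] != 0) {
            // Row = offset / width, column = offset % width.
            return {static_cast<int>(offset % w), static_cast<int>(offset / w), true};
        }
    }
    return {0, 0, false};
}
```

On a 4 pixel wide mask whose only white pixel is at offset 6 this reports (2, 1). Put a single stray white pixel at offset 1, though, and it reports (1, 0) instead, which is exactly the noise problem.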

Instead we will go with the more robust 'Contours' method. What this involves is drawing a polygon round all the white 'blobs' on the screen and using the bounding boxes from those to give us our encapsulating box which we will call the Region of Interest (ROI for short). The great thing about working out this region of interest is that once we have done it once we don't need to do it again as, provided the game board doesn't move on the screen whilst the system is running, it will always be in the same place. This means once we introduce a loop into our system that repeatedly processes the screen, we only have to process the region we are interested in (are you seeing why it's called the ROI now?) which can potentially save us a lot of processing and make our system much faster (something important in real-time processing like this). Ok, on with the show!

Generally when you detect an area of interest in an image but haven't done any processing on it you refer to it as a 'blob' because just now that is all it is; a white blob on a black image. We have 4 of these that we are interested in just now, the 4 'broken pieces' of our square, and quite a few that we aren't, anything else that was purple on the screen. To draw the bounding boxes round the parts we are interested in we first have to draw a polygon round every part we detected so we can rule them in or out. To do this we call this function on our image:

vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
findContours(purpleOnly, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));

And that gives us the variable 'contours' which is full of the points of the smallest shape possible round each blob. We can see these shapes drawn on our image and we'll add the routine to do that shortly but first we'll declare a new image to draw them on and then work out the bounding boxes of the shapes, draw them on and combine them.

Rect bounds;
Mat drawing = Mat::zeros(purpleOnly.size(), CV_8UC3);
int j = 0;
for (size_t i = 0; i < contours.size(); i++) {
   if (arcLength(contours[i], true) > 500){
       Rect temp = boundingRect(contours[i]);
       rectangle(drawing, temp, Scalar(255, 0, 0), 2, 8);
       if (j == 0) {
           bounds = temp;
       } else {
           bounds = bounds | temp;
       }
       j++;
   }
}

Here we declare a Mat and set it all to black. Notice we've dropped the alpha channel here so we use CV_8UC3 instead. We then iterate round each contour and use the arcLength method to make sure its perimeter is long enough for us to be interested in it. If it is long enough we work out a bounding rectangle of the contour and then draw that onto our image using the rectangle routine which takes the image, rectangle, colour, thickness and line type. Finally if this is the first rectangle we've processed we just set it to our final rectangle and if it isn't we combine it with our final rectangle using the | operator. Once this routine is done we have a rectangle representing our game board!
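If you are wondering what combining two rectangles actually does, it produces the smallest rectangle that contains both. Here is a sketch of that idea in plain C++ rather than OpenCV's own code. Box, unite and uniteAll are hypothetical names, and uniteAll returns false when there was nothing to combine so the caller knows not to use the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Box {
    int x, y, width, height;
};

// Smallest box that contains both a and b.
Box unite(const Box& a, const Box& b) {
    assert(a.width >= 0 && a.height >= 0 && b.width >= 0 && b.height >= 0);
    const int left = std::min(a.x, b.x);
    const int top = std::min(a.y, b.y);
    const int right = std::max(a.x + a.width, b.x + b.width);
    const int bottom = std::max(a.y + a.height, b.y + b.height);
    return {left, top, right - left, bottom - top};
}

// Combines every box into 'out'. Returns false (leaving 'out' untouched)
// when the list is empty, so the caller can skip using the result.
bool uniteAll(const std::vector<Box>& boxes, Box& out) {
    if (boxes.empty()) {
        return false;
    }
    out = boxes[0];
    for (std::size_t i = 1; i < boxes.size(); ++i) {
        out = unite(out, boxes[i]);
    }
    return true;
}
```

For example, combining a 5 by 5 box at (10, 10) with a 10 by 10 box at (20, 30) gives a 20 by 30 box starting at (10, 10).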

Let's draw a few other things onto the image so we can see some parts of the process more clearly.

After the above routine add this code:

//Draw contours in green (colours are B,G,R)
for (size_t i = 0; i < contours.size(); i++) {
    Scalar color = Scalar(0, 255, 0);
    drawContours(drawing, contours, i, color, 2, 8, hierarchy, 0, Point());
}
//Draw the combined game board rectangle in red
rectangle(drawing, bounds, Scalar(0, 0, 255), 2, 8);

This will draw each of the contours we worked out earlier in green, which will match exactly the purple areas of the source image, and our final game board rectangle in red. The bounding rectangles of the large white blobs, which we drew earlier in blue, are on the same image, named 'drawing'. (Remember OpenCV orders colours as B, G, R, so Scalar(255, 0, 0) is blue and Scalar(0, 0, 255) is red.) We can display this using:

namedWindow("Contours", CV_WINDOW_AUTOSIZE);
imshow("Contours", drawing);

One final thing we will do is make the ROI image out of our main image and display it. Here is how:

Mat roi(screen, bounds);

and

namedWindow("ROI", CV_WINDOW_AUTOSIZE);
imshow("ROI", roi);

We can tell where the game board is on screen by checking the x and y values of the bounding rectangle we made so let's add a little print out to tell us that as well:

cout << "Board starts at: (" << bounds.x << "," << bounds.y << ")\n";
fflush(stdout);

Compile everything and run the code and this should be your final output (give or take a few windows depending on whether you replaced each window call or added extras):

The "Screen" window showing the results of our purple detection

The "Contours" image with contours (i.e. the bits we detected as purple) in green,
large contour bounding boxes in blue and the result of combining all these
bounding boxes in red.

The "ROI" window showing only the game board

You can see it is really coming together now and with just a little more work we will be able to work out where each game piece is and simulate mouse clicks into any position in the game board we want regardless of where it is on the screen.

Stay tuned to find out how!

Oh, as an addendum I've added in a check to make sure the count of j is greater than 0 before making any use of the variable 'bounds' because if it isn't it means bounds won't have been initialised and our program will crash. You'll see this in this zip file containing all the source files for the project so far. There are also some functions in there commented out that you might be interested in playing with; try uncommenting them and displaying the output in a window to see what they do.

Wednesday, 26 September 2012

Game Player - Step 1 - Finding the Game Board
