Thursday, 31 December 2020

Where Are The Toys?

For the "Tidy the Toys" challenge we need to be able to locate the different coloured blocks so we can move towards them, pick them up and take them to the target zone. We're also planning to use another colour to identify the target zone itself. The only practical way of doing this is to use a camera and process the image to find the coloured blocks. 

Fortunately, one reason we didn't apply for Pi Wars 2020 was because I wanted to spend some time looking at computer vision. Originally I was going to look at OpenCV to do some image processing, but Brian Starkey's talk on "Computer Vision From Scratch" at the Pi Wars 2019 Mini Conference made me realise that an image is just a 2D array of numbers, and it should be possible to do something myself. 

So I developed "BlockyVision". This consists of a class library that allows blocks of colours to be found in an image, as well as a GUI that shows what the image processing is doing. 

I talked a little about this in a previous post, but this post gives more detail about how it works. 

Here's an example of the GUI showing the original image, a pre-processed image, and, finally, the original image again with the located colour blocks shown. 

BlockyVision GUI Showing blocks


The GUI can either receive images from a server (i.e. the software on the robot) or load an image from a file. The client runs the same image processing code as the robot, so, in theory, the blocks shown on the GUI should match what the robot is finding. The GUI is written in Python using the tkinter library, so it can run on a laptop or another Raspberry Pi.  

The image processing is fairly simple. The block finder is given a list of blocks to find, together with the Red, Green and Blue (RGB) values for matching each colour. It also has a colour threshold (shown by the slider on the GUI) which determines how far a pixel's colour can be from the given RGB values and still produce a match. 
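As a rough sketch, the block finder's configuration might look something like this. The names and values here are illustrative assumptions; the post doesn't give the actual class or field names used in BlockyVision.

```python
# Illustrative configuration for the block finder. The names and RGB
# values are assumptions, not BlockyVision's real ones.
block_colours = {
    # block name -> reference (R, G, B) value to match against
    "red":    (200, 40, 40),
    "green":  (40, 180, 60),
    "blue":   (40, 60, 200),
    "target": (200, 200, 40),   # extra colour marking the target zone
}

# Colour threshold (the GUI slider): how far a pixel's colour may be
# from the reference value and still count as a match.
colour_threshold = 60
```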

The first pass goes through all the pixels in the image and, if a pixel matches one of the blocks we're looking for, marks it with that block's colour. Originally I also turned all the other pixels black, so the middle image showed only the matched block colours, but this proved too slow; I now use an internal boolean array to record whether each pixel has matched or not. 

To compare colours I'm using Euclidean distance in the RGB colour space to determine whether a pixel's colour is within the threshold for a block we're looking for. I did try using HSV values, but didn't get quite as good results. I should probably look at other colour spaces, but for now this works, and I don't have time to go down a different route at this stage. 
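The first pass can be sketched like this: a Euclidean distance check in RGB space, recording which block (if any) each pixel matched. The function and parameter names are my own for illustration, not BlockyVision's actual API.

```python
import math

def colour_distance(c1, c2):
    """Euclidean distance between two (R, G, B) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def first_pass(image, block_colours, threshold):
    """Mark which block colour (if any) each pixel matches.

    `image` is a list of rows of (R, G, B) tuples and `block_colours`
    maps a block name to its reference RGB value. Unmatched pixels are
    marked None, playing the role of the internal boolean array
    described in the post.
    """
    matched = [[None] * len(row) for row in image]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            for name, rgb in block_colours.items():
                if colour_distance(pixel, rgb) <= threshold:
                    matched[y][x] = name
                    break  # first matching block colour wins
    return matched
```

A deliberately small worked example: a near-red pixel within the threshold is marked "red", while a dark grey pixel matches nothing and stays None.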

The second pass then goes through the matched pixels and searches their neighbours to group them into contiguous blocks. The biggest block found is the one we select as the matching block. There is also a minimum block size so we don't pick up spurious matching pixels.

The block finder class then returns a list of blocks, each one having the centre co-ordinates and the height and width. 
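The second pass could be implemented as a flood fill over the matched pixels, keeping the largest region per colour above the minimum size and reporting its centre, width and height. This is only one plausible way to do it; the 4-connected neighbourhood, the `min_size` default and the return format are all my assumptions, not details from the post.

```python
from collections import deque

def find_blocks(matched, min_size=4):
    """Group matched pixels into contiguous regions and, for each block
    colour, keep the largest region of at least `min_size` pixels,
    returning its centre co-ordinates, width and height.

    `matched` is the 2D array from the first pass: a block name per
    pixel, or None. Uses a 4-connected breadth-first flood fill.
    """
    height, width = len(matched), len(matched[0])
    seen = [[False] * width for _ in range(height)]
    best = {}  # block name -> (size, centre, width, height)
    for y in range(height):
        for x in range(width):
            name = matched[y][x]
            if name is None or seen[y][x]:
                continue
            # Flood fill the contiguous region starting at (x, y).
            queue = deque([(x, y)])
            seen[y][x] = True
            xs, ys = [], []
            while queue:
                cx, cy = queue.popleft()
                xs.append(cx)
                ys.append(cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and matched[ny][nx] == name):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            size = len(xs)
            if size < min_size:
                continue  # ignore spurious small clusters
            w = max(xs) - min(xs) + 1
            h = max(ys) - min(ys) + 1
            centre = (sum(xs) // size, sum(ys) // size)
            if name not in best or size > best[name][0]:
                best[name] = (size, centre, w, h)
    return {n: {"centre": c, "width": w, "height": h}
            for n, (s, c, w, h) in best.items()}
```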

Performance is not great - on a Pi 3 I'm processing a 128 x 96 image in about 200ms - so I'm getting around 4 frames per second consistently. This is not ideal, but enough to attempt the challenge. 

My philosophy is to get things working well enough to proceed to the next stage of a challenge. I can always come back and improve it later if I need to - and have any time remaining.



1 comment:

  1. Again nice work and nice to see people sharing info :-) I have been working on the same challenge but have used OpenCV and HSV to identify the coloured toys. I have also used OpenCV for the distance finding. Next task is to use OpenCV to locate a couple of ArUco markers that will mark the RH / LH sides of the target zone.
