Introduction to OpenCV

If you think of every smartphone as a computer, then the majority of computers today have a built-in camera. But how much software uses that camera? Not much.

In this week's tutorial, we will learn a bit about the OpenCV library for computer vision. OpenCV is a huge library, with tons of features. I'm by no means qualified to teach all of OpenCV. But I have used it in the past (see, for example, this website), so it's not a complete stretch to teach a little OpenCV.

We're not going to focus on camera capture during this tutorial. It's not all that hard. Instead, we will focus on what you do once you've got an image. In the grand scheme of things, as long as you can process images fast enough, you can think of your camera as just an infinite stream of image files... Your computer vision algorithm can be simplified to (a) process an image, (b) act on the result, and (c) move on to the next image.

That being the case, let's focus on how to manipulate images in OpenCV. For simplicity, we're going to work in Java. The main benefit of using Java is that we can use a good IDE and enjoy its auto-completion features. Without an IDE, you're going to be spending a lot of time in your browser typing queries like "OpenCV Java Matrix import" to find the right things to import, and then browsing lots of JavaDoc to figure out what methods are available.

For the record, I succeeded in doing this whole tutorial in Java on Windows 8 (64-bit), executing everything from a Git Bash (cygwin) terminal. As far as I can tell, the OpenCV installers make Windows the easiest platform. But I've certainly been wrong before.

The first step is to install OpenCV. You can find instructions at the OpenCV website. Since we're working in Java, your goal is an installation that includes the Java bindings: the OpenCV jar and the matching native library.

If your installation went well, you should have chosen some location for an OpenCV folder. Within it, there should be a "build" folder, within which you should be able to find a "java" folder. This should have the file "opencv-2410.jar" in it, as well as an "x64" folder with "opencv_java2410.dll" in it.

The path to OpenCV is important for you to remember. The OpenCV website's examples all assume you'll provide it on the command line whenever compiling or running code.

Our examples today all use ant. I'm not an ant expert, so the ant buildfiles in this tutorial are all roughly identical to what appeared in the OpenCV tutorials I studied before building this tutorial. Here's our standard build file. You should name it "build.xml".
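The version below is a reconstruction, modeled closely on the ant buildfile from the official OpenCV Java introduction. It expects two properties, ocvJarDir and ocvLibDir, to be supplied on the command line:

```xml
<project name="Hello" basedir="." default="rebuild-run">

  <property name="src.dir" value="src"/>

  <!-- Where to find opencv-2410.jar; passed in as -DocvJarDir=... -->
  <property name="lib.dir" value="${ocvJarDir}"/>
  <path id="classpath">
    <fileset dir="${lib.dir}" includes="**/*.jar"/>
  </path>

  <property name="build.dir" value="build"/>
  <property name="classes.dir" value="${build.dir}/classes"/>
  <property name="jar.dir" value="${build.dir}/jar"/>
  <property name="main-class" value="Hello"/>

  <target name="clean">
    <delete dir="${build.dir}"/>
  </target>

  <!-- Compile everything under src/ -->
  <target name="compile">
    <mkdir dir="${classes.dir}"/>
    <javac includeantruntime="false" srcdir="${src.dir}"
           destdir="${classes.dir}" classpathref="classpath"/>
  </target>

  <target name="jar" depends="compile">
    <mkdir dir="${jar.dir}"/>
    <jar destfile="${jar.dir}/${ant.project.name}.jar" basedir="${classes.dir}">
      <manifest>
        <attribute name="Main-Class" value="${main-class}"/>
      </manifest>
    </jar>
  </target>

  <!-- Run with java.library.path pointing at the native dll/so;
       passed in as -DocvLibDir=... -->
  <target name="run" depends="jar">
    <java fork="true" classname="${main-class}">
      <sysproperty key="java.library.path" path="${ocvLibDir}"/>
      <classpath>
        <path refid="classpath"/>
        <path location="${jar.dir}/${ant.project.name}.jar"/>
      </classpath>
    </java>
  </target>

  <target name="rebuild" depends="clean,jar"/>
  <target name="rebuild-run" depends="clean,run"/>

</project>
```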

Hopefully the comments in the file make sense. It will compile all java files in the entire tree starting at src/. However, it expects there to be a file named Hello.java, and Hello is expected to have a public static void main(String[] args) method.

Now let's go ahead and write something just slightly more complex than "hello world". In OpenCV, the matrix data type is central to almost everything we do. Let's make sure we can create a matrix, initialize it, and change some of its values. Save this file as src/Hello.java:
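Something like the following should work (this assumes OpenCV 2.4.10; the static block loads the native library, which must happen before any other OpenCV call):

```java
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;

public class Hello {
    static {
        // Load the native library (opencv_java2410) before any OpenCV call
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
    }

    public static void main(String[] args) {
        // A 3x3 single-channel byte matrix, every element initialized to 1
        Mat m = new Mat(3, 3, CvType.CV_8UC1, new Scalar(1));
        // Overwrite the center element with 5
        m.put(1, 1, 5);
        System.out.println("m = " + m.dump());
    }
}
```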

It's time to compile and run this file. We can't just run ant, though, because ant needs some information about where our OpenCV jar and OpenCV native libraries are. Let's suppose that you downloaded and installed OpenCV to the parent of the current folder. In that case, you'd type:
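Assuming that layout, the invocation looks something like this (adjust both paths to wherever your jar and native library actually live):

```shell
ant -DocvJarDir=../opencv/build/java -DocvLibDir=../opencv/build/java/x64
```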

That's pretty cumbersome, so I decided to make a script file. I added $1 to the end of the line, so that I can pass parameters (like clean) by typing ./buildit.sh clean.
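My buildit.sh is roughly that one line in a file:

```shell
#!/bin/sh
# Forward an optional ant target (e.g. "clean") as $1
ant -DocvJarDir=../opencv/build/java -DocvLibDir=../opencv/build/java/x64 $1
```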

Now you should be able to run the program. Your output should look something like this:
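Assuming the program builds a 3x3 matrix of ones and sets the center element to 5, the interesting part of the output is the matrix dump, roughly like this (exact spacing may differ):

```
m = [  1,   1,   1;
   1,   5,   1;
   1,   1,   1]
```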

Now let's look at how to work with image files.

OpenCV makes it very easy to do some powerful things. To support this claim, let's look at how to do face detection using OpenCV.

The first thing we'll need is a filter file. You can think of this as the result of some AI algorithm having been trained on a large corpus of faces. We'll use lbpcascade_frontalface.xml, which comes with OpenCV. Download it and save it in a convenient place. We'll also need some images to work with. There is a tradition in computer vision of using images of Lena Söderberg. Here's a picture of her. However, I prefer using this picture of a famous band from the 1970s. Note that one of the above images is a jpeg and the other is a png. OpenCV handles png by default. If you're on Linux, you might need to install a plug-in before OpenCV can deal with jpegs.

We can find faces in an image using code like this:
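Here is a sketch of that code, again against the 2.4 Java API. The image filename is a placeholder; substitute whichever image you downloaded, and make sure the cascade file's path is correct:

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfRect;
import org.opencv.core.Point;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.highgui.Highgui;
import org.opencv.objdetect.CascadeClassifier;

public class Hello {
    static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

    public static void main(String[] args) {
        // Explicit file paths: the trained cascade, and the image to scan
        CascadeClassifier faceDetector =
                new CascadeClassifier("lbpcascade_frontalface.xml");
        Mat image = Highgui.imread("band.png"); // placeholder filename

        // Run the detector; it fills a MatOfRect with one Rect per face
        MatOfRect faceDetections = new MatOfRect();
        faceDetector.detectMultiScale(image, faceDetections);
        System.out.println("Detected " + faceDetections.toArray().length + " faces");

        // Draw a green rectangle around each detected face
        for (Rect rect : faceDetections.toArray()) {
            Core.rectangle(image, new Point(rect.x, rect.y),
                    new Point(rect.x + rect.width, rect.y + rect.height),
                    new Scalar(0, 255, 0));
        }
        Highgui.imwrite("faces_found.png", image);
    }
}
```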

Note that I'm taking a very strict approach toward identifying file resources. In examples online, the resources are often embedded in a jar. Taking this route, with explicit filenames, is much better for our long-term goal of working with arbitrary images that arrive after the jar is created. It also works more cleanly (there are lots of forum posts about trouble loading images with OpenCV when the images are in a jar).

It turns out that cropping in OpenCV is very easy. All we need to do is create a Rectangle that describes the range of pixels to select, and then we make a new Matrix using an existing Matrix and the Rectangle.

Just for fun, this code makes a separate png file for each rectangle it finds.
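A minimal version, assuming image and faceDetections were filled in by the detection step (output filenames are placeholders):

```java
// Assumes: Mat image and MatOfRect faceDetections from detectMultiScale
int i = 0;
for (Rect rect : faceDetections.toArray()) {
    // Building a new Mat from an existing Mat and a Rect selects
    // just that rectangle of pixels
    Mat cropped = new Mat(image, rect);
    Highgui.imwrite("face" + i + ".png", cropped);
    i++;
}
```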

Resizing images in OpenCV is equally straightforward. While there are many algorithms for getting a good resizing, the easiest is to just use the Imgproc.resize() function, and provide it with a Size object:
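A sketch, reusing one of the cropped face matrices from the previous step:

```java
// Resize a cropped face to 1000x1000; s controls the output dimensions
Size s = new Size(1000, 1000);
Mat resized = new Mat();
Imgproc.resize(cropped, resized, s);
Highgui.imwrite("face_big.png", resized); // placeholder filename
```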

In the example above, I'm resizing each face to 1000x1000 pixels. It should be easy to see how to make an image smaller or larger, and also how to stretch images in one direction, using different values for s.

When we create an image from a file via the Highgui.imread() function, each element of the Matrix will represent a color intensity. When loading a png, this typically means the Matrix will be a three-channel color image, stored in OpenCV's BGR (blue-green-red) channel order.

We can use Imgproc.cvtColor() to convert an image from one intensity format to another. This lets us, for example, convert from BGR to grayscale:
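Roughly like this, continuing with one of the face rectangles from before:

```java
// Crop to a detected face, then convert the crop to grayscale in place
Mat cropped = new Mat(image, rect);
Imgproc.cvtColor(cropped, cropped, Imgproc.COLOR_BGR2GRAY);
Highgui.imwrite("face_gray.png", cropped); // placeholder filename
```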

In the example above, I'm cropping and then converting. To convert the whole image directly, we would replace both instances of cropped with image in the call to cvtColor. We could also convert from the old image into a new one. To do so, first create a new Mat(), then use image as the first (source) parameter, and the new Mat as the second (destination) parameter in the call to cvtColor.

Now let's go ahead and put a few of these concepts together. We're going to find as many faces as we can in an image, and then for each rectangular face region, we're going to change the input image so that the region is grayscale.
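A sketch, under the same assumptions as before (image loaded with Highgui.imread, faceDetections filled by detectMultiScale):

```java
for (Rect rect : faceDetections.toArray()) {
    // Shallow view onto the face region: no pixels copied yet
    Mat sub = image.submat(rect);
    // To gray (this allocates a new matrix behind the scenes)...
    Imgproc.cvtColor(sub, sub, Imgproc.COLOR_BGR2GRAY);
    // ...and back to BGR, so the element size matches the original
    Imgproc.cvtColor(sub, sub, Imgproc.COLOR_GRAY2BGR);
    // Copy the gray-looking pixels back over the original region
    sub.copyTo(image.submat(rect));
}
Highgui.imwrite("gray_faces.png", image); // placeholder filename
```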

There are a few new OpenCV functions in this code. Let's go over them briefly. First, we use the submat method to get a submatrix of the original matrix. This is a shallow copy: we don't actually copy the pixels, we just hold references to them. That's going to be important later on. When we call cvtColor on the submatrix to make it grayscale, OpenCV creates a new matrix and copies the result into it, so after the first cvtColor, the submatrix no longer refers to the same pixels as image. Then we convert back to BGR. This is an important step: image is in BGR mode, and we can't write 8-bit grayscale pixels into the positions of a 24-bit BGR matrix. When we convert from grayscale to BGR, there's no magic to re-create the original colors... we're just promoting the 8-bit values to 24-bit values. Finally, we use copyTo to copy back into the original matrix. We use submat again here, to overwrite the pixels we originally extracted.

This is a rather simple demonstration, but it's good to know for when you need to change formats. Many file formats in OpenCV support extra features. The most obvious is the level of compression/loss of quality when saving a jpeg. The default value is 95%, but we can change it:
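Here's a sketch that writes the same image twice at different quality settings (output filenames are placeholders):

```java
// Write at high quality...
MatOfInt high = new MatOfInt();
high.fromArray(Highgui.CV_IMWRITE_JPEG_QUALITY, 100);
Highgui.imwrite("out_high.jpg", image, high);

// ...and again at very low quality
MatOfInt low = new MatOfInt();
low.fromArray(Highgui.CV_IMWRITE_JPEG_QUALITY, 10);
Highgui.imwrite("out_low.jpg", image, low);
```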

It's best to think of MatOfInt as an array of integers. We can populate it using a variable argument function (fromArray). The format is that the array should hold the ID of a configuration option, then a value, then another ID, then another value, and so forth. In our case, we only use the jpeg quality ID, and we use different values. If you look at the resulting images, you should see quite a difference.

When using OpenCV, it's a good idea to start by checking whether OpenCV already has a function that does what you want. For example, you could blend two images by iterating through their points and manually computing a mixture of the BGR values at each point. However, that code wouldn't generalize to other formats (such as 8-bit grayscale). More importantly, it wouldn't be efficient. OpenCV tries to use your GPU, or at least native C code, whenever possible. Why write slow, clunky Java code when OpenCV already does the work, and quickly?
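Here's roughly what blending looks like with Core.addWeighted. The input filenames are placeholders, and I resize the second image to match the first, since addWeighted requires equal sizes and types:

```java
Mat a = Highgui.imread("lena.png");  // placeholder filenames
Mat b = Highgui.imread("band.png");
// addWeighted needs matching dimensions, so force b to a's size
Imgproc.resize(b, b, a.size());

Mat blended = new Mat();
// blended = 0.7*a + 0.3*b + 0.0
Core.addWeighted(a, 0.7, b, 0.3, 0.0, blended);
Highgui.imwrite("blended.png", blended);
```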

Most of the above code should make sense. Given two images of the same dimensions, we can multiply each by a weight, sum them, and save the result. Note that the 5th parameter is an optional scalar to add to each element of the output array.

We can render text on top of an image, too. Given that you already understand blending images, you probably have a good idea about how you could make a "watermark" text, once you have a Mat with some text in it.
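In the 2.4 Java bindings, putText lives on the Core class. A minimal sketch:

```java
// Draw white text near the bottom-left corner of the image
Core.putText(image, "hello, opencv", new Point(20, image.rows() - 20),
        Core.FONT_HERSHEY_SIMPLEX, 2.0, new Scalar(255, 255, 255), 3);
Highgui.imwrite("text.png", image); // placeholder filename
```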

There are a few different putText functions. You should check the documentation for more details. In particular, OpenCV doesn't provide many font options by default, and those that are available are selected by an integer constant.

This is going to be a bit trickier. For starters, there's a bit of theory involved in doing edge detection, which you can read about here. Once that makes sense, doing edge detection is as simple as computing a few gradients and then combining them. To make our work easier, we'll first blur the image, then create the gradients, then take their absolute values, and then add the gradients together. Adding the gradients is a bit imprecise, but it should be good enough.

You'll need a few more imports than before:
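Relative to the earlier examples, the new ones are CvType (for the 16-bit gradient depth) and Size (for the blur kernel). A full import list might look like:

```java
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.highgui.Highgui;
import org.opencv.imgproc.Imgproc;
```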

And then when we write the code, note that we're using quite a few of the Imgproc features that we've explored before.
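A sketch of the whole pipeline (blur, grayscale, gradients, absolute values, sum); the input filename is a placeholder:

```java
Mat image = Highgui.imread("band.png"); // placeholder filename

// 1. Blur, to suppress noise before differentiating
Mat blurred = new Mat();
Imgproc.GaussianBlur(image, blurred, new Size(3, 3), 0);

// 2. Grayscale, since we want a single intensity channel
Mat gray = new Mat();
Imgproc.cvtColor(blurred, gray, Imgproc.COLOR_BGR2GRAY);

// 3. Gradients in x and y, at 16-bit depth so negative values survive
Mat gradX = new Mat(), gradY = new Mat();
Imgproc.Sobel(gray, gradX, CvType.CV_16S, 1, 0);
Imgproc.Sobel(gray, gradY, CvType.CV_16S, 0, 1);

// 4. Absolute values, scaled back down to 8 bits
Mat absX = new Mat(), absY = new Mat();
Core.convertScaleAbs(gradX, absX);
Core.convertScaleAbs(gradY, absY);

// 5. Approximate the combined gradient as a 50/50 blend
Mat edges = new Mat();
Core.addWeighted(absX, 0.5, absY, 0.5, 0, edges);
Highgui.imwrite("edges.png", edges);
```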

Make sure you take a look at the documentation for the Sobel operation. You should also look into the Scharr operator, which gives better edge detection.

Earlier, I suggested that you shouldn't directly manipulate pixels if you can find a function to do the job for you. In this example, we quickly look at how to do direct manipulation, just in case you need it.
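Here's a sketch that blacks out every pixel whose row and column coordinates are both even, which is one quarter of the image:

```java
for (int r = 0; r < image.rows(); r++) {
    for (int c = 0; c < image.cols(); c++) {
        if (r % 2 == 0 && c % 2 == 0) {
            double[] pixel = image.get(r, c); // read the BGR values (unused)
            image.put(r, c, 0, 0, 0);         // write black
        }
    }
}
Highgui.imwrite("darkened.png", image); // placeholder filename
```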

In this example, we set 1/4 of the pixels in the image to black, by iterating through the Matrix, pixel by pixel, and setting some to be (0, 0, 0). Note that the image.get() call is completely unnecessary. It's just to show that you can get a pixel value.

The first "next step" is to create a circular filter. Your goal should be to extract a circle from one image, and superimpose it onto another image. If you want a hint to get started, look here.

The second "next step" is to create a different filter. In the above examples, I often took advantage of a tutorial from here, here, or here, and transliterated it to Java from C++. Now it's your turn.

The third "next step" is to create an extension of some sort for Node.js, so that you can invoke code that runs an OpenCV filter on images that are uploaded from the web.