How to code a nice user-guided foreground extraction algorithm? (Addendum)

After writing my last article on the Easy (user-guided) foreground extraction algorithm, I realized that you could maybe think I was exaggeratedly arguing the whole algorithm can be re-coded very quickly from scratch. After all, I’ve just illustrated how the things work by using G’MIC command lines, but there is already a lot of image processing algorithms implemented in G’MIC, so it is somehow a biased demonstration. So let me just give you the corresponding C++ code for the algorithm. Here again, you may find I cheat a little bit, because I use some functions of a C++ image processing library (CImg) that actually doe most of the hard implementation work for me.

But:

  1. You can use this library too in your own code. The CImg Library I use here works on multiple platforms and has a very permissive license, so you probably don’t have to re-code the key image processing algorithms by yourself. And even if you want to do so, you can still look at the source code of CImg. It is quite clearly organized and function codes are easy to find (the whole library is defined in a single header file CImg.h).
  2. I never said the whole algorithm could be done in few lines, only that it was easy to implement 🙂

So, dear C++ programmer, here is the simple prototype you need to make the foreground extraction algorithm work:

#include "CImg.h"
using namespace cimg_library;

int main() {

  // Load input color image.
  const CImg image("image.png");

  // Load input label image.
  CImg labels("labels.png");
  labels.resize(image.width(),image.height(),1,1,0);  // Be sure labels has the correct size.

  // Compute gradients.
  const float sigma = 0.002f*cimg::max(image.width(),image.height());
  const CImg blurred = image.get_blur(sigma);
  CImgList gradient = blurred.get_gradient("xy");
    // gradient[0] and gradient[1] are two CImg images which contain
    // respectively the X and Y derivatives of the blurred RGB image.

  // Compute the potential map P.
  CImg P(labels);
  cimg_forXY(P,x,y)
    P(x,y) = 1/(1 +
                cimg::sqr(gradient(0,x,y,0)) + // Rx^2
                cimg::sqr(gradient(0,x,y,1)) + // Gx^2
                cimg::sqr(gradient(0,x,y,2)) + // Bx^2
                cimg::sqr(gradient(1,x,y,0)) + // Ry^2
                cimg::sqr(gradient(1,x,y,1)) + // Gy^2
                cimg::sqr(gradient(1,x,y,2))); // By^2

  // Run the label propagation algorithm.
  labels.watershed(P,true);

  // Display the result and exit.
  (image,P,labels).display("Image - Potential - Labels");

  return 0;
}
c

To compile it, using g++ on Linux for instance, you have to type something like this in a shell (I assume you have all the necessarily headers and libraries installed):

$ g++ -o foo foo.cpp -Dcimg_use_png -lX11 -lpthread -lpng -lzsses

Now, execute it:

$ ./foo

and you get this kind of result:

cimg_ef

Once you have the resulting image of propagated labels, it is up to you to use it the way you want to split the original image into foreground/background layers or keep it as a mask associated to the color image.

The point is that even if you add the code of the important CImg methods I’ve used here, i.e. CImg<T>::get_blur(), CImg<T>::get_gradient() and CImg<T>::watershed(), you won’t probably exceed about 500 lines of C++ code. Yes, instead of a single line of G’MIC code. Now you see why I’m also a big fan of coding image processing stuffs directly in G’MIC instead of C++ ;).

A last remark before ending this addendum:

Unfortunately, the label propagation algorithm itself is hardly parallelizable. It is based on a priority queue and basically, the choice of a new label to set depends on how the latest label has been set. This is a sequential algorithm by nature. So, when processing large images, a good idea is to do the computations and preview the foreground extraction result only on a smaller preview of the original image. Then compute the full-res mask only once, after all key points have been placed by the user.

Well that’s it, happy coding!

One thought on “How to code a nice user-guided foreground extraction algorithm? (Addendum)

  1. […] also the Addendum I wrote, that describes how this can be implemented in […]