I was working on a small algorithm that took a while to finish its processing, so I thought of using POSIX threads for multithreading, where I failed horribly. I spent a good amount of time on it, but realized it probably needed more. I knew that OpenCV has TBB support, so I started looking for small examples that would help me learn how to use OpenCV’s TBB API. Examples were really hard to find, but thankfully I found this post on the OpenCV Forum: How To Use parallel_for.
What is TBB
Threading Building Blocks (TBB) is a C++ template library developed by Intel for writing software that takes advantage of multi-core processors. OpenCV provides a simple API to take advantage of TBB.
OpenCV with TBB
OpenCV lets you use the functionality of TBB with its native datatypes without much hassle. It provides the cv::parallel_for_ function, which splits an index range into sub-ranges and runs a loop body on each of them, on multiple cores when OpenCV is built with TBB support. So here’s how you use loops with multi-core processors using OpenCV and TBB.
I have made a sloppy example of Gaussian Blur using TBB.
When you profile this implementation against the normal implementation (without TBB), you might find that the TBB implementation is sometimes considerably slower. This happens because of the overhead of the many extra function calls. So if you are doing some sort of heavy processing, you might consider using TBB. But if you’re just performing a small task, e.g. OpenCV’s accumulateWeighted function, the TBB implementation turns out to be 3-4 times slower over 1000 iterations.
P.S. I am building an Android application which will be launched soon on the Google Play store.