Programming Multi-core | Part 1
Writing, testing, and debugging parallel algorithms that actually take advantage of multi-core processors is still very hard. Most applications in use today are single-threaded by design and see no speedup when run on a multi-core machine. I prefer to call this the "performance-oriented threading problem on multi-core", but a term like "concurrency" is much simpler.
Concurrency is new to many developers, but threading problems have a long history in scientific and high-performance computing, where many approaches have partially solved them and can be leveraged, at least in theory. Threading-oriented APIs such as OpenMP (shared memory) have been around for a long time, as has MPICH (message passing). In this tutorial, I would like to take you along on my journey with TBB and the Microsoft Parallel Extensions (native and managed), both of which adopt task parallelism. Let me start with Intel Threading Building Blocks (TBB) and then move on to the Parallel Extensions.
TBB comes in two versions, commercial and open source, but as a developer I prefer to download the TBB source code and compile it on my own machine. OK, let's start.
Preparation:
1 - Download the TBB 2.0 source code here. Extract it to your working directory.
2 - Download the TBB documentation: Getting Started Guide, Reference Manual, and Tutorial.
3 - Install Visual C++ 2005/2008. For Express Edition, you will need MASM 8.0.
4 - Set the environment variables by running vsvars32.bat. For VS 2008 it is located at:
C:\Program Files\Microsoft Visual Studio 9.0\Common7\Tools
5 - The TBB makefile needs the GNU Make utility. I am using MinGW (mingw32-make.exe).
Copy this file to the TBB working directory where the makefile is located.
6 - You are now ready to compile. Use the mingw32-make [option] command, for example:

mingw32-make : Default build. Equivalent to 'make tbb tbbmalloc'.
mingw32-make all : Equivalent to 'make tbb tbbmalloc test examples'. Builds everything.
mingw32-make tbb : Make TBB release and debug libraries.
mingw32-make tbbmalloc : Make TBB scalable memory allocator libraries.
mingw32-make test : Compile and run unit-tests
mingw32-make examples : Build libraries and run all examples
mingw32-make clean : Clean all compiled binaries.
Note: Please consult the TBB/build/index.html document if you want to compile with something other than CL.EXE.
TBB 2.0 compiled on my laptop in about a minute, and I got tbb.dll and tbbmalloc.dll along with all the libs needed. The things you need to check are the TBB header files in the include directory and the object file libraries (*.lib). If you want a quick overview of the TBB 2.0 facilities, take a look at the names of the header files (and, of course, the TBB Tutorial).
Your First TBB Program:
Now let's start using TBB.
1 - Create an empty VC++ project.
2 - Adjust your project properties.
Set the following VC++ configuration properties:
Linker > Additional Library Directories = path to the TBB binaries
C/C++ > General > Additional Include Directories = path to TBB include directory
3 - Write your first code as follows (the code comes from the TBB Getting Started guide):
Any thread that uses TBB has to initialize a task_scheduler_init object. Its default constructor and destructor manage the lifetime of the task scheduler. To add a call to the parallel_for template function, pass a blocked_range object that describes the iteration space as the first parameter. blocked_range is a template class provided by the TBB library. Its constructor takes three parameters:
- The lower bound of the range.
- The upper bound of the range.
- The grainsize.
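To make the three constructor parameters more concrete, here is a minimal sketch of the idea behind a blocked range, written with the standard library only. SketchRange and its members are hypothetical names invented for this illustration, and this is not TBB's implementation. It assumes a half-open range (the upper bound itself is excluded) and assumes a range stops being split once it holds no more than grainsize elements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in that mirrors the idea of a blocked range.
// It is NOT TBB's implementation; names and layout are assumptions.
struct SketchRange {
    std::size_t lower;      // inclusive lower bound
    std::size_t upper;      // exclusive upper bound
    std::size_t grainsize;  // largest chunk that is no longer split

    SketchRange(std::size_t lo, std::size_t hi, std::size_t grain)
        : lower(lo), upper(hi), grainsize(grain) {
        assert(grain > 0);  // a zero grainsize could never stop splitting
        assert(lo <= hi);
    }

    std::size_t size() const { return upper - lower; }

    // A range is worth splitting only while it is larger than the grainsize.
    bool is_divisible() const { return size() > grainsize; }

    // Keep the lower half in *this and return the upper half.
    SketchRange split() {
        std::size_t middle = lower + size() / 2;
        SketchRange upper_half(middle, upper, grainsize);
        upper = middle;
        return upper_half;
    }
};
```

In this sketch, a range built as SketchRange(0, 1000, 100) is divisible, and one call to split() leaves the halves 0 to 500 and 500 to 1000. A range that already holds grainsize elements or fewer is left whole.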
Add the following preprocessor includes to your source file:
#include <iostream>
#include <string>
#include <algorithm>
#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"
Then put the parallel_for call in place of your for loop.
parallel_for subdivides the range into sub-ranges of approximately grainsize elements each. You can get the implementation of the SubStringFinder class from the TBB Getting Started guide. OK, now you are ready to use the other features of Intel TBB to write your own algorithms, and I am sure you can do that. I will now move on to Microsoft's initiative on "concurrency".
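One last TBB note before that: the way a range gets cut into pieces of roughly grainsize elements can be sketched without TBB at all. The helper below is a serial, standard-library-only illustration. sketch_for, ChunkList, and ChunkRecorder are hypothetical names and not part of TBB, and the halving strategy is an assumption about the general idea rather than a description of TBB's scheduler. In real TBB code, the body object would receive a blocked_range and the pieces would be handed to worker threads.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical serial helper, not part of TBB: halve [begin, end) until a
// piece holds at most `grainsize` elements, then call body(lo, hi) on it.
template <typename Body>
void sketch_for(std::size_t begin, std::size_t end,
                std::size_t grainsize, const Body& body) {
    assert(begin <= end);
    if (grainsize == 0) grainsize = 1;      // guard against endless splitting
    if (end - begin <= grainsize) {
        if (begin < end) body(begin, end);  // leaf: run the body on this chunk
        return;
    }
    std::size_t middle = begin + (end - begin) / 2;
    sketch_for(begin, middle, grainsize, body);  // a real scheduler could run
    sketch_for(middle, end, grainsize, body);    // these two halves in parallel
}

typedef std::vector<std::pair<std::size_t, std::size_t> > ChunkList;

// Body functor that only records which chunks it was given.
struct ChunkRecorder {
    ChunkList* chunks;
    explicit ChunkRecorder(ChunkList* out) : chunks(out) {}
    void operator()(std::size_t lo, std::size_t hi) const {
        chunks->push_back(std::make_pair(lo, hi));
    }
};
```

Calling sketch_for(0, 1000, 100, ChunkRecorder(&chunks)) in this sketch yields 16 chunks of 62 or 63 elements each. That is why the sub-ranges are only approximately grainsize elements long: halving stops as soon as a piece fits, so a piece can end up anywhere between about half the grainsize and the full grainsize.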
The Parallel Extensions are Microsoft's implementation for handling concurrency in managed and native code on Windows. I have had the Parallel Extensions for .NET Framework 3.5 December 2007 CTP installed since its first release. The default installation directory is C:\Program Files\Microsoft Parallel Extensions Dec07 CTP, where you can find one managed DLL (System.Threading.dll). System.Threading contains around six namespaces and has native dependencies on Kernel32.dll.
Parallel Extensions to the .NET Framework has a broader scope than TBB. It is a managed programming model for data parallelism, task parallelism, and coordination on parallel hardware, unified by a common work scheduler. Parallel Extensions makes it easier for developers to write programs that scale to take advantage of parallel hardware without having to deal with many of the complexities of today's concurrent programming models. Parallel Extensions provides several new ways to express parallelism in our code:
- Declarative data : LINQ-to-Objects that executes queries in parallel (PLINQ).
- Imperative data : Imperative data-oriented operations such as for and foreach loops.
- Imperative task : Express potential parallelism via expressions and statements
Hopefully I can cover these three approaches and compare them with TBB in my next post. If you want to follow me on this concurrency journey, please install TBB and the Parallel Extensions directly on your OS (don't use VPC or VMware) and start reading the documentation. I have about 4 GB of Channel 9 videos on parallelism that are very useful for inspiration. If you want the videos, just leave your address as a comment on this post, and I will ship the DVD to you.
Hope this helps!
Regards,
Risman Adnan