Small Changes, Big Speed Ups

3/4/2012

Lately I've been busy with a new side project. It came about because I'm annoyed at the existing products that accomplish the same task (it's a very specific industry where they charge 500 times what they should, bill for each component separately when it should be one product, etc.). I've gone from basic design and storyboards to fleshing out most of the back end, and the only thing left is connecting the front end to the API. Anyway, I've got some free time, so I figured I'd post a bit about what I've been doing with my utility library and an interesting find.

One of the things that I've been focusing on is speeding up the code in my utility library. There are a number of functions that were a bit slow, to say the least. For instance, all of the Bitmap extensions were originally unusable on any image over 400 pixels wide. Switching to unsafe code cut the run time of each of those functions by about a factor of 10. However, I've been running into people who need to use the code on images that are much, much larger than I expected. I'm talking 10240 x 10240 images, which is insane when you consider that a 10 megapixel image is about 3200 x 3200 (so we're talking about a roughly 105 megapixel image here). So I looked at my options for improving the speed a bit more... and I found one in multithreading.

In .NET 4 we were given the Task Parallel Library (TPL). Prior to the TPL, in order to do anything in parallel you had to manage threads yourself, deal with a thread pool, etc. To be honest it was a bit of a pain, and it was one of the reasons no one really went down that route unless they had to. But with the TPL, you can pretty much just take the following:

    for (int x = 0; x < 10; ++x)
    {
        DoSomething();
    }

and make it multithreaded by just doing the following:

    Parallel.For(0, 10, x =>
    {
        DoSomething();
    });

Basically we just send it the start value (inclusive) and the stop value (exclusive), give it an Action, and we're done; the TPL takes care of the rest. This can be done with most of your for and foreach loops (Parallel.ForEach covers the latter), but there are some things to keep in mind. First, it does nothing to synchronize access to data from outside the Action. So if two of the threads it uses try to modify the exact same location in an array at the same time, you're going to get some bugs. On top of that, not everything in the .NET world is thread safe (for instance, generating random numbers from a single Random instance shared across threads is a huge problem). But keeping that sort of stuff in mind, it's still pretty easy to use and works quite well.
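To make those two pitfalls a bit more concrete, here's a minimal sketch of one way to sidestep them. It isn't code from my utility library; the class and method names are made up for illustration. It uses the Parallel.For overload that takes a per-worker initializer, so each worker gets its own subtotal or its own Random and only touches shared data in a controlled way.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names, for illustration only.
public static class ParallelSafetySketch
{
    // Sums 0..count-1 in parallel. Each worker accumulates into its own
    // subtotal and merges it into the shared total once, via Interlocked.Add.
    public static long SumInParallel(int count)
    {
        long total = 0;
        Parallel.For<long>(0, count,
            () => 0L,                                          // per-worker starting subtotal
            (i, loopState, subtotal) => subtotal + i,          // touches no shared state
            subtotal => Interlocked.Add(ref total, subtotal)); // one synchronized merge per worker
        return total;
    }

    // Fills an array with random values below maxExclusive. Each worker
    // creates its own Random, because a shared instance is not thread safe.
    public static int[] RandomValues(int count, int maxExclusive)
    {
        int[] values = new int[count];
        int seed = Environment.TickCount;
        Parallel.For<Random>(0, count,
            () => new Random(Interlocked.Increment(ref seed)), // distinct seed per worker
            (i, loopState, random) =>
            {
                values[i] = random.Next(maxExclusive);         // each index is written exactly once
                return random;
            },
            random => { });                                    // nothing to merge at the end
        return values;
    }
}
```

Calling ParallelSafetySketch.SumInParallel(1000) should give the same answer as the plain loop, and RandomValues never shares a Random between workers. The trade-off is that the random sequence is no longer reproducible from a single seed.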

In the case of my Bitmap extensions, I found that generally speaking I was reading from a number of pixels in one image (that wasn't changing) and modifying one pixel in the destination image (and only setting it once). As such, my code for the most part wasn't going to be affected much by the potential pitfalls (although the random number generation did pop up). So for instance the Negative function used to be:

    public static Bitmap Negative(this Bitmap OriginalImage, string FileName = "")
    {
        if (OriginalImage == null)
            throw new ArgumentNullException("OriginalImage");
        ImageFormat FormatUsing = FileName.GetImageFormat();
        Bitmap NewBitmap = new Bitmap(OriginalImage.Width, OriginalImage.Height);
        BitmapData NewData = NewBitmap.LockImage();
        BitmapData OldData = OriginalImage.LockImage();
        int NewPixelSize = NewData.GetPixelSize();
        int OldPixelSize = OldData.GetPixelSize();
        for (int x = 0; x < NewBitmap.Width; ++x)
        {
            for (int y = 0; y < NewBitmap.Height; ++y)
            {
                Color CurrentPixel = OldData.GetPixel(x, y, OldPixelSize);
                Color TempValue = Color.FromArgb(255 - CurrentPixel.R, 255 - CurrentPixel.G, 255 - CurrentPixel.B);
                NewData.SetPixel(x, y, TempValue, NewPixelSize);
            }
        }
        NewBitmap.UnlockImage(NewData);
        OriginalImage.UnlockImage(OldData);
        if (!string.IsNullOrEmpty(FileName))
            NewBitmap.Save(FileName, FormatUsing);
        return NewBitmap;
    }

But by changing it to this:

    public static Bitmap Negative(this Bitmap OriginalImage, string FileName = "")
    {
        OriginalImage.ThrowIfNull("OriginalImage");
        ImageFormat FormatUsing = FileName.GetImageFormat();
        Bitmap NewBitmap = new Bitmap(OriginalImage.Width, OriginalImage.Height);
        BitmapData NewData = NewBitmap.LockImage();
        BitmapData OldData = OriginalImage.LockImage();
        int NewPixelSize = NewData.GetPixelSize();
        int OldPixelSize = OldData.GetPixelSize();
        int Width = NewBitmap.Width;
        int Height = NewBitmap.Height;
        Parallel.For(0, Width, x =>
        {
            for (int y = 0; y < Height; ++y)
            {
                Color CurrentPixel = OldData.GetPixel(x, y, OldPixelSize);
                Color TempValue = Color.FromArgb(255 - CurrentPixel.R, 255 - CurrentPixel.G, 255 - CurrentPixel.B);
                NewData.SetPixel(x, y, TempValue, NewPixelSize);
            }
        });
        NewBitmap.UnlockImage(NewData);
        OriginalImage.UnlockImage(OldData);
        if (!string.IsNullOrEmpty(FileName))
            NewBitmap.Save(FileName, FormatUsing);
        return NewBitmap;
    }

I get the image being worked on by multiple processors, making the code about 4 times faster (the original took about 40ms to run on my test image and about 10ms after the change, although how much of a speedup you'll see depends on how many cores your system has, etc.). And all I really had to do was change the outer loop to get an instant speed boost. The next item I was going to look at was my ORM and MicroORM, and this is where I ran into my little surprise...
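If you want to try the same outer-loop change without my Bitmap helpers (LockImage, GetPixel, SetPixel and friends all come from the utility library), here's a rough, self-contained sketch that works on a raw pixel buffer instead. The 4-bytes-per-pixel B, G, R, A layout and the NegativeSketch name are assumptions made for the example, and I split the work by row rather than by column, since a row sits together in memory in this layout.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the Bitmap extension; works on a raw buffer.
public static class NegativeSketch
{
    private const int BytesPerPixel = 4; // assumed layout: B, G, R, A

    // Returns a new buffer holding the negative of the source pixels.
    // The source is only read, and each destination byte is written once,
    // so the rows can be processed in parallel without any locking.
    public static byte[] Negative(byte[] source, int width, int height)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        int stride = width * BytesPerPixel;
        if (source.Length != stride * height)
            throw new ArgumentException("Buffer size does not match width and height.", "source");

        byte[] result = new byte[source.Length];
        Parallel.For(0, height, y =>
        {
            int rowStart = y * stride;
            for (int x = 0; x < width; ++x)
            {
                int i = rowStart + x * BytesPerPixel;
                result[i] = (byte)(255 - source[i]);         // blue
                result[i + 1] = (byte)(255 - source[i + 1]); // green
                result[i + 2] = (byte)(255 - source[i + 2]); // red
                result[i + 3] = 255;                         // opaque, like Color.FromArgb(r, g, b)
            }
        });
        return result;
    }
}
```

Whether rows or columns make the better unit of work is something to measure on your own images rather than take from this sketch.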

One of the things I do in these cases is set up a small app that runs the code a number of times (usually something like 10,000) to get a baseline. Then I make my modifications and test again to see whether the run time improved. I use my Profiler code for this. In this instance I was just curious how the SQLHelper, ORM, and MicroORM code stacked up, because each one is built on top of the other (SQLHelper->MicroORM->ORM). And here is where I was surprised. I did a couple of tests: inserting a single entry, updating it, selecting one entry, selecting all entries from a table, etc. I was expecting SQLHelper to be the fastest, the MicroORM next, and the ORM the slowest. I was correct in that assessment, but the spread wasn't nearly as large as I was expecting. For the average insert the times were 4.92ms, 4.97ms, and 5.09ms respectively (I have an SSD, the database was local, my processor is pretty freakin' good, etc., plus the data was small: only a couple of strings, a float, decimal, int, DateTime, long, and bool. I didn't send it anything complex). That's a 0.17ms difference. For the select: 4.53ms, 4.60ms, and 4.69ms. That's 0.16ms... They're all like that. As far as I can tell I've inadvertently built an ORM that does lazy loading, deals with multiple databases, etc. and is about as fast as straight SQL. Well, almost anyway, because there are a number of tricks you can use to speed things up when you're dealing with straight SQL.
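For anyone who wants to run the same kind of before-and-after comparison without my Profiler class, the idea can be sketched with nothing more than Stopwatch. This is not how the Profiler is implemented; the TimingSketch name and the single warm-up run are my own assumptions for the example.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; a stand-in for a real profiler.
public static class TimingSketch
{
    // Runs the action the given number of times and returns the mean
    // time per run in milliseconds.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        if (action == null)
            throw new ArgumentNullException("action");
        if (iterations <= 0)
            throw new ArgumentOutOfRangeException("iterations");

        action(); // warm-up run so JIT compilation is not counted in the timing

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
            action();
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

You'd call it once before a change and once after, with the same action and iteration count (for example 10,000), and compare the two averages. With differences as small as the ones above, it's worth repeating the whole measurement a few times before trusting it.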

For instance, inserting those 10,000 rows took a total of 49 seconds when doing each one individually. However, we can simply use the SqlBulkCopy class and have that insert the 10,000 rows in a total of 525ms... By the way, SqlBulkCopy is now used by SQLHelper (the ExecuteBulkCopy function). We can also update multiple rows at once when using SQLHelper (which the ORM and MicroORM don't quite do yet). But there are things that I could do at the MicroORM and ORM levels to give them an advantage (caching being a big one, which I don't do yet). There are a couple of other things as well, but all in all it's not a bad surprise. Anyway, take a look, leave feedback, and happy coding.
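For reference, here's roughly what a bulk insert looks like when you use SqlBulkCopy directly. This is a sketch of the framework class, not the ExecuteBulkCopy function in SQLHelper. The column names, the connection string, and the destination table are hypothetical, and it assumes the .NET Framework's System.Data.SqlClient and a reachable SQL Server.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical names throughout; adjust the columns to match your table.
public static class BulkInsertSketch
{
    // Builds an in-memory table of sample rows to send in one batch.
    public static DataTable BuildRows(int count)
    {
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Amount", typeof(decimal));
        table.Columns.Add("CreatedOn", typeof(DateTime));
        for (int i = 0; i < count; ++i)
            table.Rows.Add("Item " + i, i * 1.5m, DateTime.UtcNow);
        return table;
    }

    // Sends every row in one bulk operation instead of one INSERT per row.
    public static void BulkInsert(string connectionString, string tableName, DataTable rows)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = tableName;
                // Map by name so the column order in the table doesn't matter.
                foreach (DataColumn column in rows.Columns)
                    bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                bulkCopy.WriteToServer(rows);
            }
        }
    }
}
```

Usage would be something like BulkInsertSketch.BulkInsert(connectionString, "dbo.Items", BulkInsertSketch.BuildRows(10000)), where dbo.Items is a made-up table that has matching columns.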


