Archives for: June 2014

Monitor Design Pattern with Semaphore

This continues my series on ways you’ve probably used design patterns in real-life and may not have even known it. The previous post was on the Locking, Double-Checked Locking, Lazy and Proxy Design Patterns.

In the last post we began to look at design patterns outside of the standard creational, structural and behavioral, venturing into concurrency design patterns. The concurrency design patterns are so critical because of the rise of multi-core processors and easy to use threading libraries.

In the last write-up I used the concept of a context switch to show the benefits of using double-checked locking. The problem with that framing is that there may not be a context switch at all; instead, two threads may genuinely be working in the same piece of code at the same time. Context switching was how single-core processors gave the appearance of multi-threading even though everything was still executing sequentially. Of course we still have context switching. We usually have only 4 or 8 cores, so to support hundreds of threads we still need context switching.

But now we run into the very real problem of multiple threads attempting to access what may be limited resources at the same time. As we saw in the last post, we can lock around specific code to limit access to that code across threads. The monitor design pattern consists of an object, the monitor, that controls when threads can access code blocks. Locking in .NET is the easiest form of the monitor design pattern in that lock(lockobj) ends up just being syntactic sugar wrapping the Monitor class. In being the easiest form, however, it is also the simplest, in that it doesn’t provide a lot of flexibility.
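
To make the sugar concrete, here is a minimal sketch (class and member names are mine, and the exact code the C# compiler emits varies by version) of roughly what a lock block expands to: a Monitor.Enter/Monitor.Exit pair wrapped in try/finally so the lock is released even if the body throws.

```csharp
using System;
using System.Threading;

class MonitorSugar
{
	private static readonly object lockObj = new object();
	public static int Counter = 0;

	public static void Increment()
	{
		// lock (lockObj) { Counter++; } compiles down to roughly this:
		bool lockTaken = false;
		try
		{
			Monitor.Enter(lockObj, ref lockTaken);
			Counter++;
		}
		finally
		{
			if (lockTaken)
				Monitor.Exit(lockObj);
		}
	}

	static void Main()
	{
		Thread[] threads = new Thread[4];
		for (int i = 0; i < threads.Length; i++)
			threads[i] = new Thread(() => { for (int j = 0; j < 1000; j++) Increment(); });
		foreach (Thread t in threads) t.Start();
		foreach (Thread t in threads) t.Join();
		Console.WriteLine(Counter);  // prints 4000, every time, because of the monitor
	}
}
```

The Monitor class itself also exposes Wait, Pulse and PulseAll, which is where the extra flexibility lives that the lock keyword hides.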

There are a lot of different types of monitors as shown in the reference above, but I wanted to talk about one specific type, the Semaphore. The premise of the Semaphore is pretty straightforward: limit the number of threads that have access to a block of code at any one time. It’s like lock, but you control the number of threads. In .NET we have two different types of semaphores, Semaphore and SemaphoreSlim. I suspect, in general, you will want a SemaphoreSlim as it is lightweight and fast, but it’s intended to be used only where wait times for the resources will be short. But how short is short? Well, that’s something you will have to investigate on your own as there is no guidance from Microsoft other than “short”. For more information on the differences between Semaphore and SemaphoreSlim please read the Microsoft article addressing it.

Using a SemaphoreSlim is incredibly easy. Here is a sample I’ve incorporated into my TPLSamples.

SemaphoreSlim pool = new SemaphoreSlim(2);  //limit the access to just two threads
List<Task> tasks = new List<Task>();
for (int i = 0; i < 5; i++)
{
	Task t = Task.Run(() =>
	{
		UpdateLog("Starting thread " + Thread.CurrentThread.ManagedThreadId);
		pool.Wait();  //wait until we can get access to the resources
		try
		{
			string result = new WebClient().DownloadString("http://msdn.microsoft.com");
			UpdateLog("Thread " + Thread.CurrentThread.ManagedThreadId + ": Length = " + result.Length);
		}
		finally
		{
			pool.Release();  //always release the semaphore, even if the download throws
		}
	});
	tasks.Add(t);
}

Task.WaitAll(tasks.ToArray());

This results in the output:

Starting Semaphore Slim Sample
Starting thread 9
Starting thread 10
Starting thread 14
Starting thread 15
Starting thread 13
Thread 10: Length = 25146
Thread 9: Length = 25152
Thread 14: Length = 25152
Thread 15: Length = 25146
Thread 13: Length = 25146
Completed Semaphore Slim Sample
Semaphore Slim Sample ran in 00:00:02.0143689

It’s not obvious from this output that we’re limiting the download to just two threads at a time, but we are. Now, we may not be able to use such a contrived example and may need to actually use this in production code. In that case you’ll want to do this asynchronously with async/await.

SemaphoreSlim pool = new SemaphoreSlim(2); //limit the access to just two threads
List<Task> tasks = new List<Task>();
for (int i = 0; i < 5; i++)
{
	Task t = Task.Run(async () =>
	{
		UpdateLog("Starting thread " + Thread.CurrentThread.ManagedThreadId);
		await pool.WaitAsync();  //wait until we can get access to the resources
		try
		{
			string result = await new HttpClient().GetStringAsync("http://msdn.microsoft.com");
			UpdateLog("Thread " + Thread.CurrentThread.ManagedThreadId + ": Length = " + result.Length);
		}
		finally
		{
			pool.Release();  //always release the semaphore, even if the download throws
		}
	});
	tasks.Add(t);
}

Task.WaitAll(tasks.ToArray());

This results in the output:

Starting Semaphore Slim Async Sample
Starting thread 9
Starting thread 15
Starting thread 14
Starting thread 13
Starting thread 17
Thread 19: Length = 25143
Thread 19: Length = 25149
Thread 21: Length = 25149
Thread 21: Length = 25143
Thread 21: Length = 25143
Completed Semaphore Slim Async Sample
Semaphore Slim Async Sample ran in 00:00:02.1258904

Wait a minute! How can the thread ids after we get the web page be different from the threads that started the work? The framework tries to help you out by maintaining a SynchronizationContext. Covering the SynchronizationContext is outside the scope of an article on using Semaphore, but I would highly recommend reading the article It’s All About the SynchronizationContext, which covers the problems with using Task.Run and losing the current SynchronizationContext. See, if we have a SynchronizationContext then awaiting isn’t a big deal; the framework will resume on the current SynchronizationContext. When we use Task.Run to spin off a new task, that current SynchronizationContext is lost and we no longer return to the same thread when we use await.

For us, in using a SemaphoreSlim, we don’t care. Losing the SynchronizationContext can be a big deal, however, especially in UI code where you need to get back to the main thread.
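
A minimal console sketch of this (class and method names are mine): thread-pool threads have no SynchronizationContext, so inside Task.Run there is nothing for await to capture, and the continuation resumes on whatever pool thread happens to be free.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SyncContextDemo
{
	public static bool HasContextInsideTaskRun()
	{
		bool hasContext = true;
		Task.Run(() =>
		{
			// Thread-pool threads have no SynchronizationContext, so there is
			// nothing for await to capture and post back to.
			hasContext = SynchronizationContext.Current != null;
		}).Wait();
		return hasContext;
	}

	static void Main()
	{
		Console.WriteLine("Context inside Task.Run: " + HasContextInsideTaskRun());  // prints False

		Task.Run(async () =>
		{
			int before = Thread.CurrentThread.ManagedThreadId;
			await Task.Delay(100);
			int after = Thread.CurrentThread.ManagedThreadId;
			// With no context captured, the continuation runs on whatever
			// pool thread is free; 'before' and 'after' may or may not match.
			Console.WriteLine(before + " -> " + after);
		}).Wait();
	}
}
```

In a UI application the main thread does have a SynchronizationContext, which is exactly why await there gets you back to the UI thread.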

So that’s really it, using a SemaphoreSlim to demonstrate the Monitor design pattern. I’ve updated my TPL Sampler to include these two samples.

Thanks for reading,
Brian

Locking, Double-Checked Locking, Lazy and Proxy Design Patterns

This continues my series on ways you’ve probably used design patterns in real-life and may not have even known it. The previous post was on the Iterator Design Pattern.

The original book on software design patterns, Design Patterns: Elements of Reusable Object-Oriented Software, discussed three types of design patterns: creational, structural and behavioral. As time has gone on we’ve found a need for further patterns to cover other targeted areas. One of those areas is concurrency patterns, which specifically address multi-threaded software development.

With the ease of the TPL and async/await integrated into .NET, multi-threaded software development should be the standard. Microsoft has gone to great pains to make multi-threaded development ridiculously easy. In fact, it makes me hesitant when I consider doing development in any other language. I had been hoping, when Swift was released, that Apple had the forethought to improve the horrendous multi-threading model in Objective-C, but they did not. In defence of Objective-C, however, it’s not any worse than most other languages.

This brings us to the double-checked locking design pattern that is a part of the concurrency patterns. Consider the following code:

public class Singleton
{
	private Singleton() { /* Some long task to initialize */ }

	private static Singleton _instance = null;
	public static Singleton Instance
	{
		get
		{
			if (_instance == null)
				_instance = new Singleton();
			return _instance;
		}
	}
}

The problem with this code is that you run the risk of multiple instances of Singleton running around when you only wanted one. Here is the scenario:

Thread 1, line 10: Since this is the first access, _instance is null.
CONTEXT SWITCH
Thread 2, line 10: Hmm, _instance is null.
CONTEXT SWITCH
Thread 1, line 11: Well, since _instance is null we have to create a new instance.
Thread 1, line 12: Return the _instance we just created.
CONTEXT SWITCH
Thread 2, line 11: Well, since _instance is null we have to create a new instance.
Thread 2, line 12: Return the _instance we just created.

Here is a perfect scenario to utilize locking. Locking is a pattern that allows a thread exclusive access to a segment of code without having to worry about other threads entering the block of code.

public class Singleton
{
	private Singleton() { /* Some long task to initialize */ }

	private static object lockObj = new object();
	private static Singleton _instance = null;
	public static Singleton Instance
	{
		get
		{
			lock (lockObj)
			{
				if (_instance == null)
					_instance = new Singleton();
			}
			return _instance;
		}
	}
}

We can see that the lock will block other threads from entering until the current thread is done. Here is the scenario:
Thread 1, line 11: No one’s here so let’s go into the lock.
Thread 1, line 13: Since this is the first access, _instance is null.
CONTEXT SWITCH
Thread 2, line 11: Hmm, some other thread is here already, I’ll have to wait.
CONTEXT SWITCH
Thread 1, line 14: Well, since _instance is null we have to create a new instance.
Thread 1, line 15: Done with the lock.
CONTEXT SWITCH
Thread 2, line 11: Whoever was here is done, let’s get going.
Thread 2, line 13: _instance has a value.
CONTEXT SWITCH
Thread 1, line 16: return _instance.
CONTEXT SWITCH
Thread 2, line 16: return _instance.

Now there are some problems with locking, most notably deadlocking. This occurs when thread 1 holds a lock while waiting for something thread 2 holds, while thread 2 is waiting for thread 1’s lock. Neither can continue and your application locks up. Additionally, by utilizing locks you turn a small piece of code into a bottleneck and eliminate a lot of the benefits of multi-threading. If threads can only enter the lock one at a time to see if _instance is null, what are we really gaining here?
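
Here is a minimal sketch of a deadlock (names and timings are mine): two threads each take one lock and then wait forever for the other’s. The demo joins with a timeout and marks the threads as background so it can report the deadlock instead of hanging.

```csharp
using System;
using System.Threading;

class DeadlockDemo
{
	static readonly object lockA = new object();
	static readonly object lockB = new object();

	// Returns true if both threads finished, false if they deadlocked.
	public static bool Run()
	{
		Thread t1 = new Thread(() =>
		{
			lock (lockA)
			{
				Thread.Sleep(500);   // give t2 time to take lockB
				lock (lockB) { }     // blocks forever: t2 holds lockB
			}
		});
		Thread t2 = new Thread(() =>
		{
			lock (lockB)
			{
				Thread.Sleep(500);   // give t1 time to take lockA
				lock (lockA) { }     // blocks forever: t1 holds lockA
			}
		});
		t1.IsBackground = t2.IsBackground = true;  // let the process exit anyway
		t1.Start();
		t2.Start();

		// Join with a timeout so the demo itself doesn't hang.
		return t1.Join(2000) && t2.Join(2000);
	}

	static void Main()
	{
		Console.WriteLine(DeadlockDemo.Run() ? "completed" : "deadlocked");
	}
}
```

The standard defense is to always acquire multiple locks in the same order; if both threads took lockA before lockB, this could never happen.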

This is where the double-checked locking pattern comes in handy. Why go into the lock if you don’t need to?

public class Singleton
{
	private Singleton() { /* Some long task to initialize */ }

	private static object lockObj = new object();
	private static Singleton _instance = null;
	public static Singleton Instance
	{
		get
		{
			if (_instance == null)  //first check
			{
				lock (lockObj)
				{
					if (_instance == null)  //second check
						_instance = new Singleton();
				}
			}
			return _instance;
		}
	}
}

We just check to see if the value is null and if it isn’t we don’t even have to consider locking. Here is the scenario:
Thread 1, line 11: Since this is the first access, _instance is null.
Thread 1, line 13: No one’s here so let’s go into the lock.
CONTEXT SWITCH
Thread 2, line 11: Hmm, _instance is null.
Thread 2, line 13: Hmm, some other thread is here already, I’ll have to wait.
CONTEXT SWITCH
Thread 1, line 16: Well, since _instance is null we have to create a new instance.
Thread 1, line 17: Done with the lock.
CONTEXT SWITCH
Thread 3, line 11: _instance has a value.
Thread 3, line 19: return _instance.
CONTEXT SWITCH
Thread 2, line 11: Whoever was here is done, let’s get going.
Thread 2, line 15: _instance has a value.
CONTEXT SWITCH
Thread 1, line 19: return _instance.
CONTEXT SWITCH
Thread 2, line 19: return _instance.

This scenario presents the best of both worlds (which is why it’s a pattern): it allows you to safely construct your singleton while still minimizing locking. But as I’m sure you know, in .NET we can do better. As I mentioned earlier, Microsoft has gone to some pretty great pains to make multi-threading easier for us. One of those ways is by using Lazy<T>.

public class Singleton
{
	private Singleton() { /* Some long task to initialize */ }
	private static readonly Lazy<Singleton> _lazyInstance = new Lazy<Singleton>(() => new Singleton());
	public static Singleton Instance
	{
		get { return _lazyInstance.Value; }
	}
}

All this is doing is taking out the need for double-checked locking by doing it for you. That’s it. Now, this is the most basic use of Lazy<T>. If you read the reference, you have a lot of control in terms of what happens when exceptions are thrown in the constructor, as well as other aspects, so if you take full advantage of it you’ll see there is a lot more it can do.
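
As a sketch of that control (class and method names are mine), the LazyThreadSafetyMode enum lets you pick the thread-safety behavior. ExecutionAndPublication, which is what the simple constructor above gives you, is the double-checked-locking behavior: the factory runs exactly once even when many threads race on .Value.

```csharp
using System;
using System.Threading;

class LazyModes
{
	// Returns how many times the factory ran while 8 threads raced on .Value.
	public static int RaceOnValue()
	{
		int factoryCalls = 0;
		Lazy<Guid> instance = new Lazy<Guid>(() =>
		{
			Interlocked.Increment(ref factoryCalls);
			Thread.Sleep(50);           // pretend initialization is expensive
			return Guid.NewGuid();
		}, LazyThreadSafetyMode.ExecutionAndPublication);

		Thread[] threads = new Thread[8];
		for (int i = 0; i < threads.Length; i++)
			threads[i] = new Thread(() => { Guid g = instance.Value; });
		foreach (Thread t in threads) t.Start();
		foreach (Thread t in threads) t.Join();
		return factoryCalls;
	}

	static void Main()
	{
		Console.WriteLine("Factory calls: " + RaceOnValue());  // prints "Factory calls: 1"
	}
}
```

The other modes trade safety for speed: None does no locking at all, and PublicationOnly may run the factory on several threads but publishes only one result.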

And all that was done to get you here. Lazy initialization is part of the creational patterns from the GoF. But its implementation is done using the Proxy design pattern. Lazy<T> uses the proxy design pattern to control access to the value of the instance. At its most generalized, the proxy design pattern “is a class functioning as an interface to something else.” That definition is incredibly broad. In addition to Lazy, the Facade, Flyweight and Adapter patterns could all also be called types of proxy patterns, though only in the loosest sense of the definition. Generally, proxy is considered applicable where there is a heavy expense or long load time in initializing or constructing an object.

Imagine you are working with images. Most of the time you are looking at the meta-data about the image: size, date created, etc. This is so you can sort them, organize them, move them around. Sure, occasionally you want to open the image, but this is an expensive task that could eat a lot of memory. To handle this scenario you create a “meta-image” object (or proxy). This provides all the information about the image without having to load it into memory. Then, when the code needs the image, the “meta-image” object can retrieve the real image, and thus you have your proxy design pattern.
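
A minimal sketch of that meta-image proxy (all names here are hypothetical, and the byte array stands in for a real file read): metadata is cheap and always available, while the pixel data is loaded only on first access.

```csharp
using System;

// Hypothetical "meta-image" proxy: cheap metadata up front, the
// expensive pixel data loaded only on demand.
class ImageProxy
{
	public string FileName { get; private set; }
	public DateTime Created { get; private set; }
	public long SizeBytes { get; private set; }

	private byte[] _pixels;          // the expensive part, not loaded yet
	public int LoadCount = 0;        // just to observe the laziness

	public ImageProxy(string fileName, DateTime created, long sizeBytes)
	{
		FileName = fileName;
		Created = created;
		SizeBytes = sizeBytes;
	}

	public byte[] Pixels
	{
		get
		{
			if (_pixels == null)     // load the real image only when asked
			{
				LoadCount++;
				_pixels = new byte[SizeBytes];   // stand-in for File.ReadAllBytes(FileName)
			}
			return _pixels;
		}
	}
}

class Program
{
	static void Main()
	{
		ImageProxy img = new ImageProxy("vacation.jpg", DateTime.Now, 1024);
		// Sorting and organizing use only metadata; nothing expensive is loaded.
		Console.WriteLine(img.FileName + " " + img.SizeBytes);
		byte[] data = img.Pixels;          // first real access triggers the load
		Console.WriteLine(img.LoadCount);  // prints 1
	}
}
```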

Thanks for reading,
Brian

Is Agile that good or that bad?

I don’t often write opinion pieces and, in fact, I’ll try to keep the opinion in this post to a minimum. I don’t intend this as link bait; I think this is my way of rubber duck debugging, of gathering my thoughts, getting my observations out and hopefully hearing from the community at large as you voice your opinions.

I need to start by saying I’ve never worked in an Agile shop. I’ve never seen it in practice, never seen how truly effective it can be (if it can be). To me Agile feels like MVVM did before I started working with it. Though I had understood the principles of MVVM, it felt bloated and impractical, too much process in the practice. Eventually, though, as you start trudging through your first application the blinds get pulled back a little bit and you start to see some of the benefits; everything starts to wire up so easily. Adding new features and fixing bugs is easier, unit tests hook in easier, and you get a solid n-tiered architecture.

In the end you end up with virtually all your code being watched over by unit tests because they just work on your view models, and you don’t have to worry about code-behind causing problems. But I don’t think I really understood the benefit until I had put MVVM into practice. It’s easy to dismiss, easy to point out that there is a significant overhead initially in developing an MVVM-based application (which is why MVVM should not be used for simple projects). You have to have management that understands the long-term benefits in the maintenance tail, and this doesn’t always happen. But Agile isn’t an architecture for software development; it’s a set of development practices.

Iterative Waterfall

I’ve always used iterative waterfall (aka iterfall) as the basis for my development. Start with as thorough an understanding of requirements as possible; make sure they’re all written down and agreed upon. Move on to design. Do your ERDs and XSDs and class diagrams and wire frames. Take those designs to the client. The wire frames are especially important for the client. Once they understand what the application is going to look like and how it’s intended to work there will be requirements changes. No biggie: go back to requirements, fix the problems, modify the design documents and then get approval again. Once design is done, start implementing the application. You may run into a problem with the design, so go back to design; if it’s going to be a problem there, then go back to requirements. Does it need to be a requirement? If yes, then move back down and, though it may be painful, fix it in design and then get approval from the customer. Then fix it in implementation.

The key to iterative waterfall is to never be afraid to move back a step. Unlike a traditional waterfall, never feel like you’re locked into a step. Just because you’ve got sign-off on requirements doesn’t mean they won’t change. And that’s okay. The other thing I like about iterative waterfall is that as you’re sitting down to do your project plan and laying out how the application is going to happen, each task becomes its own iterative waterfall. Then when you get into the maintenance tail the iterative waterfall starts all over again.

This is still, however, a “Big Design Up Front” approach and is distinct from the iterative and incremental development practices of some Agile processes because rather than working each cycle completely, in iterative waterfall you move forwards or backwards within a path. So is this bad?

Agile Practices

There are some aspects of the Agile process I have seen as beneficial and attempted to incorporate into my own process when I worked as a project lead. One of the best, with a major caveat, is pair programming. The caveat is that when attempting to do pair programming you have to be very selective in terms of who you pair up, and it’s not for everyone. Where it made sense I attempted to incorporate pair programming, working with another software engineer side-by-side. We would discuss our thoughts on what we were implementing, and potential issues and conflicts with other aspects of the application. It was quality collaboration that produced better code.

But I couldn’t work with some engineers like this. One engineer would just shut down. He would mindlessly type at the keyboard whatever I was saying. When I was at the keyboard it was like pulling teeth to get him to contribute. It worked better to step away from the computer, have more of a casual conversation regarding what he was working on and then let him go forward. Then there are those engineers who fit the stereotype for software engineers. They tend to be introverted. They don’t talk a lot. It’s incredibly hard to get them to contribute at meetings. But let them sit alone in front of a computer with an idea of what you need from them and you’ll get 12 hours’ worth of work in 2. All I’d often get from software engineers like this is uncomfortable squirming. They don’t want you there, they don’t want you on their computer, they don’t want to be on your computer. They just want to be left alone to work. They have their environment set up just right and that’s how they like it. So pair programming works, just in a limited scope with limited participants. This doesn’t make it bad; it just needs to be applied within the scope of the engineers assigned to a project.

Now I do have to say I’m not a fan of the “Daily Scrum” idea. What I found most effective was to meet with each of the engineers (or, on a larger project, I imagine with each team lead) and see where they were at. I had a set time, 9 AM, because by then everybody would be in, where I would walk around to each engineer and ask them what they were working on, how things were going and any problems they were having. Now, the largest team I’ve ever led had only 4 engineers other than myself. So maybe, just as simple projects don’t benefit from MVVM, I’ve never worked on a project large enough that it would have benefited from Agile.

Continuous integration is another aspect of Agile I think is necessary to any development methodology. With so many tools out there it should just be the norm. We use CruiseControl at work, but Scott Hanselman just did a great article on AppVeyor for continuous integration.

Use cases and user stories are fundamental to the design of an application. They’re listed as an “Agile practice” oddly enough, but I always thought they were crucial to the design phase of a waterfall.

I like the idea of Sprints post-implementation. When users start using the application and bug reports and feature requests start coming in, it makes sense to go with a more rapid release cycle until things settle down and then move out to a quarterly release cycle (except for high priority bugs). But does this make sense during initial application development? Maybe it’s my own ignorance. Maybe I don’t understand what a Sprint is. So here is the definition from the creators:

The Sprint

The heart of Scrum is a Sprint, a time-box of one month or less during which a “Done”, useable,
and potentially releasable product Increment is created. Sprints best have consistent durations
throughout a development effort. A new Sprint starts immediately after the conclusion of the
previous Sprint.
The Scrum Guide™
The Definitive Guide to Scrum:
The Rules of the Game


Ken Schwaber and Jeff Sutherland

If you read the above reference you get a bigger picture. It just means that within the timeframe of the sprint you have to have your code ready for production. It doesn’t mean that the application as a whole needs to be ready, but that the code you are responsible for is ready for production. And there is a lot more to Scrum. (But isn’t this really just milestones?)

Agile Methodologies

But is Scrum Agile? Well, I mean, it’s a part of Agile, right? Ken Schwaber and Jeff Sutherland were two people who signed the Agile Manifesto. The Agile Manifesto reads, in its entirety, as follows:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
Individuals and interactions over Processes and tools
Working software over Comprehensive documentation
Customer collaboration over Contract negotiation
Responding to change over Following a plan
That is, while there is value in the items on the right, we value the items on the left more.

Kent Beck James Grenning Robert C. Martin
Mike Beedle Jim Highsmith Steve Mellor
Arie van Bennekum Andrew Hunt Ken Schwaber
Alistair Cockburn Ron Jeffries Jeff Sutherland
Ward Cunningham Jon Kern Dave Thomas
Martin Fowler Brian Marick

© 2001, the above authors. This declaration may be freely copied in any form, but only in its entirety through this notice.

These sound like noble goals. These are all things we want. (And hey, Uncle Bob’s in there! 🙂) So what is Agile? Well, it seems to be a bunch of software development methodologies and practices that people wanted to lump together and call Agile. The best I can get out of all the websites and blogs and wiki pages on Agile, the only thing that seems to tie them together is that they’re not “classic” waterfall. One of the points that seems to be made about the differences is that Agile moves testing to a different place than classic waterfall. In classic waterfall you don’t test anything until you are all done. Test Driven Development (TDD) moves testing to the very front of development. As a part of iterative development, having continuous cycles means that at each cycle you are testing. But this doesn’t apply when looking at the iterative waterfall. As I discussed above, each task during the implementation of the iterative waterfall is itself an iterative waterfall. This makes it sound a lot like iterative development, but the process itself is fundamentally different. You have to ask yourself: do I have the requirements to implement this? Are the designs done? Do I need to modify them? How do I implement this? Now let’s test my code (i.e. write a unit test). And now we’re in the maintenance phase of this task. Iterative waterfall is an incredibly recursive process.

So if I utilize the Agile practices without utilizing the Agile methodologies am I doing Agile development? A lot of the Agile practices are really just good development practices that happen to be parallel to the purpose of Agile. Why does that make them Agile practices? So with the 10 or so listed Agile methodologies, which ones produce the best output? The Agile methodologies themselves seem so different. I mean, if I were to spin up a new project would I use Extreme Programming or Scrum? Are there aspects of each of them that can be co-mingled? Why would I choose one over the other? And then there’s method tailoring which seems to be about adapting the methodology and practices to suit your needs. But there is no guidance in terms of what situations work better for what methodologies.

But Agile is so Damn Profitable

There are a lot of books about how each of the Agile methodologies is better than classic waterfall, but there doesn’t seem to be anything on how and where one methodology would be better than another. I see a lot of “Buy this Agile book” and “Hire me as your Agile consultant” and “Pay a bunch of money to become a Professional Scrum Master™” (scrum.org and by extension EBMgt are really good at this). Believe it or not, I’m actually okay with this. If there is truly value to be gained then it makes sense to pay for it. Agile (writers, teachers, consultants) seems to have truly grasped the concept of capitalism and run with it. But more to my point, I have yet to have someone from the Agile community provide some sort of guide on where and how the Agile methodologies are better or worse. I’ve read a lot of blogs and web sites touting Agile from people much smarter than me.

Wonderful, I’m sold. So should I go with Adaptive Software Development or Agile Modeling or Agile Unified Process or Crystal Methods (Crystal Clear) or Disciplined Agile Delivery or Dynamic Systems Development Method or Extreme Programming or Feature Driven Development or Lean software development or Kanban or Scrum or Scrum-ban?

And what if I’m not sold? The Editor in Chief of drdobbs.com, Andrew Binstock, had a great article titled The Corruption of Agile. It seems to me that one of the key issues with Agile is understanding in what scenarios to apply the right methodologies and practices and then integrating method tailoring for your specific needs. To paraphrase, I read the article as, “People have so integrated Agile practices into their culture that they no longer apply method tailoring.” That’s just my interpretation. This all concludes in Andrew’s response to the responses, titled Addressing the Corruption of Agile, which links to responses from Rob Myers of the Agile Institute and Uncle Bob.

So there are a lot of people that love Agile. And a lot of people that hate it. It seems absurd to me that this should be such a polarizing issue. And I sit right in the middle. Is Agile that good or that bad? It troubles me that, other than hating on classic waterfall (which makes sense to me), the various methodologies that make up Agile don’t defend why and when they are better than the other Agile methodologies. Every method has its good points and bad points and shouldn’t be universally applied. Even the iterative waterfall I mentioned above: it fails on really large applications when resources are immediately available. When we’re still defining requirements but we have engineers available to begin development, well, there’s iterative waterfall’s downfall. You end up with idle resources. The easy way around that is to start with a smaller team and do frequent iterations between requirements and design.

Anyway, this was just my place to get out some of my thoughts on Agile. I freely admit that I’m ignorant of Agile. I’m just not sure how to move forward, so for now I won’t; I want that forward movement to be meaningful.

Thanks for reading,
Brian

Iterator Design Pattern

This continues my series on ways you’ve probably used design patterns in real-life and may not have even known it. The previous post was on the Adapter Design Pattern.
This is a kind of “catch-all” post where I want to talk not only about the Iterator Design Pattern but also custom enumerators for Parallel.ForEach and ensuring you give your threads enough work.

The iterator pattern is a way to move through a group of objects without having to understand the internals of the container of those objects. Anything in .NET that implements IEnumerable or IEnumerable<T> provides an iterator to move over the values. List<T> and Dictionary<TKey, TValue> are good examples.
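
In fact, foreach itself is sugar over that iterator: it calls GetEnumerator and walks the sequence with MoveNext and Current. A small sketch (class and method names are mine) showing the two forms doing the same work:

```csharp
using System;
using System.Collections.Generic;

class IteratorSugar
{
	public static int SumWithForeach(List<int> values)
	{
		int sum = 0;
		foreach (int v in values)
			sum += v;
		return sum;
	}

	public static int SumWithEnumerator(List<int> values)
	{
		// foreach is syntactic sugar over the iterator the container hands out:
		int sum = 0;
		using (IEnumerator<int> it = values.GetEnumerator())
		{
			while (it.MoveNext())
				sum += it.Current;
		}
		return sum;
	}

	static void Main()
	{
		List<int> values = new List<int> { 1, 2, 3, 4 };
		Console.WriteLine(SumWithForeach(values));    // prints 10
		Console.WriteLine(SumWithEnumerator(values)); // prints 10
	}
}
```

Either way, the caller never sees whether the container is an array, a linked list or a hash table internally; that is the whole point of the pattern.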

If we look at the GreyScaleParallelSample in my TPL sampler, we have the following code:

System.Drawing.Imaging.BitmapData bmData = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
int stride = bmData.Stride;
unsafe
{
	byte* start = (byte*)(void*)bmData.Scan0;

	int height = bmp.Height;
	int width = bmp.Width;

	Parallel.For(0, height, y =>
	{
		byte* p = start + (y * stride);
		for (int x = 0; x < width; ++x)
		{
			byte blue = p[0];
			byte green = p[1];
			byte red = p[2];

			p[0] = p[1] = p[2] = (byte)(.299 * red
				+ .587 * green
				+ .114 * blue);

			p += 3;
		}
	});
}
bmp.UnlockBits(bmData);

This code is very similar to code I used in some image manipulation I had to implement. Here, however, all we’re doing is setting each pixel to grey scale (I’m not sure why, but for some reason I use the British spelling of grey). If we look at it, we’re iterating over the height and then the width. But an image is really just a byte array where every three bytes hold the blue, green and red values for a given pixel. We don’t need to treat it like a map with height and width.

Now to do this we’ll need a custom iterator (see? I brought it back to the purpose of this post 🙂). Fortunately Parallel.ForEach allows you to supply an IEnumerable so that you can customize how it iterates over the values. We can just set up a simple for loop and yield on each value.

public static IEnumerable<int> ByVariable(int max, int increment)
{
	for (int i = 0; i < max; i+= increment)
		yield return i;
}

What this does is allow you to iterate in a Parallel.ForEach by some increment up to some supplied maximum. I’ve added a new sample to my TPLSampler called GreyScaleBySingleParallelSample that uses this.

System.Drawing.Imaging.BitmapData bmData = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
int stride = bmData.Stride;
System.IntPtr Scan0 = bmData.Scan0;
unsafe
{
	byte* start = (byte*)(void*)Scan0;

	Parallel.ForEach(ByVariable(bmp.Height * bmp.Width * 3, 3), i =>
	{
		byte* p = (start + i);
		byte blue = p[0];
		byte green = p[1];
		byte red = p[2];

		p[0] = p[1] = p[2] = (byte)(.299 * red
					+ .587 * green
					+ .114 * blue);
	});
}
bmp.UnlockBits(bmData);

The max value of ByVariable is the height of the image times the width times 3 (since each byte represents one color of the three that make up a pixel), and the amount to increment by is 3. This way we can move through the byte array 3 bytes (or 1 pixel) at a time.

So this is awesome, right? We’ll spin off a bunch of threads and this will crank through a big image in no time. So let’s run this against an 8 MB image and compare it to the first method.

Reseting Image
Starting Grey Scale Parallel Sample
Completed Grey Scale Parallel Sample
Grey Scale Parallel Sample ran in 00:00:00.1700515

Reseting Image
Starting Grey Scale By Single Parallel Sample
Completed Grey Scale By Single Parallel Sample
Grey Scale By Single Parallel Sample ran in 00:00:01.5654025

Wait, what? This second method runs significantly slower (and “Resetting” is spelled wrong). As I’ve mentioned in the past, when you can’t give your threads enough work to overcome the cost of having to spin up and/or set up the thread, you just end up wasting time. If you’ve read my past posts on this, I know I may seem like I keep harping on it, but it is important. I’ve seen quite a few cases where people think that the solution to a problem with a long-running process is just to throw more threads at it. That may very well be a solution, but you need to understand what your code is doing. It doesn’t make sense when optimizing code to just throw everything against a wall and see what sticks.

That being said, there are times where using the “ByVariable” enumerable is helpful. There is an interface I interact with that returns a string array where the values are grouped by (value, unit, error). I have to do a bunch of handling and work on the values that are returned in the array. In that use it makes sense.
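
As a sketch of that use (the data and names here are hypothetical), ByVariable can walk a flattened array in groups of three, landing on the first element of each (value, unit, error) triplet:

```csharp
using System;
using System.Collections.Generic;

class TripletSample
{
	public static IEnumerable<int> ByVariable(int max, int increment)
	{
		for (int i = 0; i < max; i += increment)
			yield return i;
	}

	static void Main()
	{
		// Hypothetical flattened (value, unit, error) groups, as described above.
		string[] readings = { "9.81", "m/s^2", "0.02", "101.3", "kPa", "0.5" };

		foreach (int i in ByVariable(readings.Length, 3))
		{
			string value = readings[i];
			string unit = readings[i + 1];
			string error = readings[i + 2];
			Console.WriteLine(value + " " + unit + " +/- " + error);
		}
	}
}
```

Here each iteration gets a whole triplet to process, so the work per element is meaningful rather than a single byte write.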

So what have we covered?

  1. What the Iterator Design Pattern is.
  2. Its implementation in .NET.
  3. How to use a custom iterator in a Parallel.ForEach.
  4. Making sure to give each thread in a Parallel.For/Each enough work.

Thanks,
Brian