## LINQ group by and GroupBy

I initially started using LINQ because it made it easy to order the objects in a list without having to write a Comparer. Just write your lambda expression and BOOM!, list sorted.

I want to take this thought a step further and, as implied by the post title, do a group by.

To start, here is an orderby on n % 2 == 0, giving us a list of the even numbers followed by the odd numbers:

```csharp
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

// The sort key is a bool; "descending" puts true (even) ahead of false (odd).
var orderedNumbers = from n in numbers
                     orderby n % 2 == 0 descending
                     select n;

foreach (var n in orderedNumbers)
{
    Console.Write("{0},", n);
}
```

This is all pretty straightforward: order by whether a number modded by 2 is 0, and we get the numbers 4,8,6,2,0,5,1,3,9,7. The sort is stable, so within the evens and within the odds the numbers keep their original order.
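For comparison, the same ordering can be written with the extension-method syntax. This is only a minimal sketch; the class and method names (`EvensFirstSketch`, `EvensFirst`) are made up for illustration.

```csharp
using System;
using System.Linq;

public static class EvensFirstSketch
{
    // Hypothetical helper: the orderby from the query above, in method syntax.
    public static int[] EvensFirst(int[] numbers)
    {
        // The key is a bool; descending order puts true (even) before false (odd).
        // OrderByDescending is a stable sort, so ties keep their original order.
        return numbers.OrderByDescending(n => n % 2 == 0).ToArray();
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        Console.WriteLine(string.Join(",", EvensFirst(numbers))); // 4,8,6,2,0,5,1,3,9,7
    }
}
```

The query syntax and the method syntax compile down to the same calls, so which one to use is mostly a matter of taste.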

But what if I want to simply have two lists, one with evens and one with odds? That’s where group by comes in.

```csharp
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

var numberGroups = from n in numbers
                   group n by n % 2 into g
                   select new { Remainder = g.Key, Numbers = g };

foreach (var g in numberGroups)
{
    // A remainder of 0 means the group holds the even numbers.
    if (g.Remainder == 0)
        Console.WriteLine("Even Numbers:");
    else
        Console.WriteLine("Odd Numbers:");

    foreach (var n in g.Numbers)
    {
        Console.WriteLine(n);
    }
}
```

with the output:

```
Odd Numbers:
5
1
3
9
7
Even Numbers:
4
8
6
2
0
```

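What’s happening here is that LINQ projects each group into an anonymous type. The result looks a bit like a dictionary, but it is really a lazily evaluated sequence (at runtime, a System.Linq.Enumerable.WhereSelectEnumerableIterator over System.Linq.IGrouping<int, int>), where each IGrouping<int, int> carries a key plus the numbers that share it. The groups come out in the order their keys first appear, which is why the odds are printed first.

It is important to note that the key everything is grouped on is the expression that comes right after the “by”, here n % 2.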
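If what you really want is two plain lists you can index by key, one option is `ToLookup`, which groups eagerly and lets you fetch a group by its key. The sketch below is only an illustration; `EvenOddSketch` and `SplitByRemainder` are hypothetical names, not something from the samples above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EvenOddSketch
{
    // Hypothetical helper: builds a lookup keyed on the remainder (the value after "by").
    public static ILookup<int, int> SplitByRemainder(IEnumerable<int> numbers)
    {
        // Caveat: for negative odd numbers n % 2 is -1 in C#, so they would get their own key.
        return numbers.ToLookup(n => n % 2);
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        ILookup<int, int> lookup = SplitByRemainder(numbers);

        // Indexing a lookup with a key that is not present returns an empty sequence.
        List<int> evens = lookup[0].ToList();
        List<int> odds = lookup[1].ToList();

        Console.WriteLine("Evens: " + string.Join(",", evens)); // Evens: 4,8,6,2,0
        Console.WriteLine("Odds: " + string.Join(",", odds));   // Odds: 5,1,3,9,7
    }
}
```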

Taking this one simple step forward, let’s group a bunch of words. The following doesn’t work quite right:

```csharp
string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

var wordGroups = from w in words
                 group w by w[0] into g
                 select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };

foreach (var g in wordGroups)
{
    Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
    foreach (var w in g.Words)
    {
        Console.WriteLine(w);
    }
}
```

giving us the output:

```
Words that start with the letter 'b':
blueberry
Words that start with the letter 'c':
Chimpanzee
Words that start with the letter 'a':
abacus
apple
Words that start with the letter 'b':
Banana
Words that start with the letter 'c':
cheese
```

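That’s because there is a bit of a catch here. Remember that the value after the “by” is what the grouping is keyed on, and that comparison is case-sensitive. In our case w[0] for Chimpanzee is 'C', not 'c', so it lands in a different group than cheese; lowercasing the key in the select only changes the label, not the grouping. If we change it to: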

```csharp
string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

// Lowercase the first letter before grouping so 'C' and 'c' share a key.
var wordGroups = from w in words
                 group w by w[0].ToString().ToLower() into g
                 select new { FirstLetter = g.Key, Words = g };

foreach (var g in wordGroups)
{
    Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
    foreach (var w in g.Words)
    {
        Console.WriteLine(w);
    }
}
```

then we get the results we expect with:

```
Words that start with the letter 'b':
blueberry
Banana
Words that start with the letter 'c':
Chimpanzee
cheese
Words that start with the letter 'a':
abacus
apple
```

Taking this even one step further, we can throw an orderby above the group so the groups come out in alphabetical order (the words inside each group keep their original order):

```csharp
var wordGroups = from w in words
                 orderby w[0].ToString().ToLower()
                 group w by w[0].ToString().ToLower() into g
                 select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };
```
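The query above only sorts by first letter. If the words inside each group should be alphabetical too, something along these lines might do it. This is a rough sketch in method syntax; `WordGroupSketch` and `GroupByFirstLetter` are hypothetical names, and it assumes none of the words are empty strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordGroupSketch
{
    // Hypothetical helper: groups words by lowercased first letter,
    // with both the groups and the words inside them in alphabetical order.
    public static List<KeyValuePair<char, List<string>>> GroupByFirstLetter(IEnumerable<string> words)
    {
        return words
            .GroupBy(w => char.ToLowerInvariant(w[0]))   // assumes w is not empty
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<char, List<string>>(
                g.Key,
                g.OrderBy(w => w, StringComparer.OrdinalIgnoreCase).ToList()))
            .ToList();
    }

    public static void Main()
    {
        string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

        foreach (var pair in GroupByFirstLetter(words))
        {
            Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }
        // a: abacus, apple
        // b: Banana, blueberry
        // c: cheese, Chimpanzee
    }
}
```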

So let’s now make this a bit over-the-top complex. Given the classes:

```csharp
public class Customer
{
    public string CompanyName { get; set; }
    public List<Order> Orders { get; set; }
}

public class Order
{
    public DateTime OrderDate { get; set; }
    public int Total { get; set; }
}
```

let’s group a customer list by customer, then by year, then by month:

```csharp
List<Customer> customers = GetCustomerList();

var customerOrderGroups = from c in customers
                          select new
                          {
                              c.CompanyName,
                              YearGroups = from o in c.Orders
                                           group o by o.OrderDate.Year into yg
                                           select new
                                           {
                                               Year = yg.Key,
                                               MonthGroups = from o in yg
                                                             group o by o.OrderDate.Month into mg
                                                             select new { Month = mg.Key, Orders = mg }
                                           }
                          };
```

Whew! That took a lot to copy and paste from MSDN’s sample library! 😉
As mentioned previously, the important part here is that the key for each group is the value after the “by”. This just creates a bunch of groupings nested inside each other: each customer holds year groups keyed on the year, and each year group holds month groups keyed on the month.
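To see the nesting in action, here is a small self-contained sketch that runs the same query and walks the three levels. The sample data, the company name "Acme", and the names `NestedGroupSketch`, `GetSampleCustomers` and `Summarize` are all made up for illustration, and the two classes are redeclared with the property names the query uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public DateTime OrderDate { get; set; }
    public int Total { get; set; }
}

public class Customer
{
    public string CompanyName { get; set; }
    public List<Order> Orders { get; set; }
}

public static class NestedGroupSketch
{
    // Hypothetical sample data standing in for a real customer list.
    public static List<Customer> GetSampleCustomers()
    {
        return new List<Customer>
        {
            new Customer
            {
                CompanyName = "Acme",
                Orders = new List<Order>
                {
                    new Order { OrderDate = new DateTime(2008, 1, 15), Total = 10 },
                    new Order { OrderDate = new DateTime(2008, 1, 20), Total = 20 },
                    new Order { OrderDate = new DateTime(2008, 3, 2), Total = 5 },
                    new Order { OrderDate = new DateTime(2009, 3, 9), Total = 7 }
                }
            }
        };
    }

    // Runs the customer/year/month grouping and flattens it into one line per month group.
    public static List<string> Summarize(IEnumerable<Customer> customers)
    {
        var groups = from c in customers
                     select new
                     {
                         c.CompanyName,
                         YearGroups = from o in c.Orders
                                      group o by o.OrderDate.Year into yg
                                      select new
                                      {
                                          Year = yg.Key,
                                          MonthGroups = from o in yg
                                                        group o by o.OrderDate.Month into mg
                                                        select new { Month = mg.Key, Orders = mg }
                                      }
                     };

        var lines = new List<string>();
        foreach (var customer in groups)
            foreach (var year in customer.YearGroups)
                foreach (var month in year.MonthGroups)
                    lines.Add(string.Format("{0} {1}-{2:00}: {3} order(s), total {4}",
                        customer.CompanyName, year.Year, month.Month,
                        month.Orders.Count(), month.Orders.Sum(o => o.Total)));
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Summarize(GetSampleCustomers()))
            Console.WriteLine(line);
        // Acme 2008-01: 2 order(s), total 30
        // Acme 2008-03: 1 order(s), total 5
        // Acme 2009-03: 1 order(s), total 7
    }
}
```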

The GroupBy method that is part of LINQ can also take an IEqualityComparer, which decides when two keys count as the same. Given the comparer:

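```csharp
public class AnagramEqualityComparer : IEqualityComparer<string>
{
    // Two words are equal when they contain the same letters in any order.
    public bool Equals(string x, string y)
    {
        return getCanonicalString(x) == getCanonicalString(y);
    }

    // Must agree with Equals: anagrams have to produce the same hash code.
    public int GetHashCode(string obj)
    {
        return getCanonicalString(obj).GetHashCode();
    }

    // Sorts the letters of the word, e.g. "team" becomes "aemt".
    private string getCanonicalString(string word)
    {
        char[] wordChars = word.ToCharArray();
        Array.Sort<char>(wordChars);
        return new string(wordChars);
    }
}
```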

we can find all the matching anagrams. This is possible because the IEqualityComparer compares words based on a sorted array of characters. If you take “meat” and “team” they both become “aemt” when sorted by their characters.

```csharp
string[] anagrams = { "from", "salt", "earn", "last", "near", "form" };

var orderGroups = anagrams.GroupBy(
    w => w.Trim(),                  // key selector
    a => a.ToUpper(),               // element selector
    new AnagramEqualityComparer()   // decides which keys are equal
);

foreach (var group in orderGroups)
{
    // The inner quotes have to be escaped, otherwise this does not compile.
    Console.WriteLine("For the word \"{0}\" we found matches to:", group.Key);
    foreach (var word in group)
    {
        Console.WriteLine(word);
    }
}
```

As with the inline LINQ, the first argument selects the key and the second selects what to put into each group’s list. The last argument is the IEqualityComparer I mentioned earlier. We don’t get duplicate keys: “last” matches “salt” under the comparer, so it joins the existing “salt” group instead of starting a new one.
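The comparer boils down to "two words are equal when their sorted letters are equal". One way to convince yourself of that is to skip the comparer and group directly on the sorted letters. Note this is a different technique from the one above, shown only as a sketch; `AnagramKeySketch`, `Canonical` and `GroupAnagrams` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnagramKeySketch
{
    // Hypothetical helper: sorts the letters of a word, so anagrams share one string.
    public static string Canonical(string word)
    {
        char[] chars = word.ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    // Groups on the sorted letters themselves, so the default string equality is enough.
    public static Dictionary<string, List<string>> GroupAnagrams(IEnumerable<string> words)
    {
        return words
            .GroupBy(w => Canonical(w))
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        string[] anagrams = { "from", "salt", "earn", "last", "near", "form" };
        Dictionary<string, List<string>> groups = GroupAnagrams(anagrams);

        Console.WriteLine(string.Join(",", groups["fmor"])); // from,form
        Console.WriteLine(string.Join(",", groups["alst"])); // salt,last
        Console.WriteLine(string.Join(",", groups["aenr"])); // earn,near
    }
}
```

The trade-off is in the keys: with the comparer, each group keeps a readable key such as "from", while here the key is the sorted form "fmor".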

That’s all for now.

Brian
