Archives for : March2009

LINQ Any and All Quantifiers with group by

Now that we’ve touched on “group by” lets throw in a couple of where clauses on the group by with any and all.

If you are really interested in all the data I have the code at the end of this post showing how I add all the products. Since this is just sample data I threw it in a list but chances are you’re going to be getting your list from some sort of DAL.

Let’s start with some lead up to “Any”.

Given:

string[] words = { "cow", "believe", "bread", "river", "receipt", "field" };
var wordsList =
    from w in words
    where w.Contains("ei")
    select w;

we can see this gives a list of words that have “ei” in them. But what if we don’t care for the list? What if we just want to know if any of the words have “ei”? We can simplify life by using the Any on a list from linq and get a boolean.

bool iAfterE = words.Any(w => w.Contains("ei"));

is true.

“No kidding”

“Oh great, folks, if you don’t know already this is Ivan.”

“So next you’re going to tell me I can use the All method on lists to tell me if all of the words contain “ei”? Come on and give your audience some credit. If you’re head wasn’t stuck so far…”

“Ivan, I get the point, I’ll make this easy for all of us and give them the code.”

“And don’t you think this whole talking to yourself think is getting old? I mean come on!”

“You think you can do better?”

“Listen to yourself. You’re talking to me who just happens to me you. What does me doing any better have to do with anything? I am you. Wow, pull yourself together and get this overwith. We have a database rewrite to review today, implement tomorrow and then a week to recode the app for the new database, time to move on.”

“Fine, here” (Note: I couldn’t do it, it looked like crap when I just put up all the code. I have to break it up a little bit so I’m ignoring Ivan because it was just a bad suggestion.)

string[] words = { "cow", "believe", "bread", "river", "receipt", "field" };
var wordsList =
    from w in words
    where w.Contains("ei")
    select w;

bool iAfterE = words.All(w => w.Contains("ei"));

As I’m sure you know, iAfterE is false as not all the words have “ei” in them.

List<Product> products = GetProductList();

var productGroups =
    from p in products
    group p by p.Category into g
    where g.Any(p => p.UnitsInStock == 0)
    select new { Category = g.Key, Products = g };

Console.WriteLine("nFor Any:");
foreach (var group in productGroups)
{
    Console.WriteLine("nFor Category {0}:", group.Category);
    foreach (var p in group.Products)
    {
        Console.Write("{0} with {1} units, ", p.ProductName, p.UnitsInStock);
    }
}

In this first product group you can see that the where on the group by is only creating groups by category where any of the products in the category have 0 units in stock. This is key to the where with group by, only if the where is satisfied will the group be created. You will end up with:

For Any:
For Category Condiments:
Aniseed Syrup with 13 units, Chef Anton’s Cajun Seasoning with 53 units, Chef Anton’s Gumbo Mix with 0 units, Grandma’s Boysenberry Spread with 120 units, Northwoods Cranberry Sauce with 6 units, Genen Shouyu with 39 units,
For Category MeatPoultry:
Mishi Kobe Niku with 29 units, Alice Mutton with 0 units, Thüringer Rostbratwurst with 0 units, Perth Pasties with 0 units, Tourtière with 21 units,
For Category DairyProducts:
Queso Cabrales with 22 units, Queso Manchego La Pastora with 86 units, Gorgonzola Telino with 0 units, Mascarpone Fabioli with 9 units,

List<Product> products = GetProductList();
var productGroups =
    from p in products
    group p by p.Category into g
    where g.All(p=> p.UnitsInStock > 0)
    select new { Category = g.Key, Products = g };

Console.WriteLine("nFor All:");
foreach (var group in productGroups)
{
    Console.WriteLine("nFor Category {0}:", group.Category);
    foreach (var p in group.Products)
    {
        Console.Write("{0} with {1} units, ", p.ProductName, p.UnitsInStock);
    }
}

For the second product group it is looking to make sure that all products in the group of a UnitsInStock of greater then 0 thus giving you:

For All:
For Category Seafood:
Ikura with 31 units, Konbu with 24 units, Carnarvon Tigers with 42 units, Nord-Ost Matjeshering with 10 units, Inlagd Sill with 112 units, Gravad lax with 11 units,
For Category Produce:
Uncle Bob’s Organic Dried Pears with 15 units, Tofu with 35 units, Rössle Sauerkraut with 26 units,
For Category Beverages:
Chai with 39 units, Chang with 17 units, Guaraná Fantástica with 20 units, Sasquatch Ale with 111 units, Steeleye Stout with 20 units,

Well, that’s a quick write up on any and all with group by. Time to start reviewing ERDs.

Brian

And here is GetProductList:

private List<Product> GetProductList()
{
    List<Product> products = new List<Product>();
    products.Add(new Product { ProductId = 10, ProductName = "Ikura", Category = "Seafood", UnitPrice = 31.0000, UnitsInStock = 31 });
    products.Add(new Product { ProductId = 13, ProductName = "Konbu", Category = "Seafood", UnitPrice = 6.0000, UnitsInStock = 24 });
    products.Add(new Product { ProductId = 18, ProductName = "Carnarvon Tigers", Category = "Seafood", UnitPrice = 62.5000, UnitsInStock = 42 });
    products.Add(new Product { ProductId = 30, ProductName = "Nord-Ost Matjeshering", Category = "Seafood", UnitPrice = 25.8900, UnitsInStock = 10 });
    products.Add(new Product { ProductId = 36, ProductName = "Inlagd Sill", Category = "Seafood", UnitPrice = 19.0000, UnitsInStock = 112 });
    products.Add(new Product { ProductId = 37, ProductName = "Gravad lax", Category = "Seafood", UnitPrice = 26.0000, UnitsInStock = 11 });
    products.Add(new Product { ProductId = 7, ProductName = "Uncle Bob's Organic Dried Pears", Category = "Produce", UnitPrice = 30.0000, UnitsInStock = 15 });
    products.Add(new Product { ProductId = 14, ProductName = "Tofu", Category = "Produce", UnitPrice = 23.2500, UnitsInStock = 35 });
    products.Add(new Product { ProductId = 28, ProductName = "Rössle Sauerkraut", Category = "Produce", UnitPrice = 45.6000, UnitsInStock = 26 });
    products.Add(new Product { ProductId = 3, ProductName = "Aniseed Syrup", Category = "Condiments", UnitPrice = 10.0000, UnitsInStock = 13 });
    products.Add(new Product { ProductId = 4, ProductName = "Chef Anton's Cajun Seasoning", Category = "Condiments", UnitPrice = 22.0000, UnitsInStock = 53 });
    products.Add(new Product { ProductId = 5, ProductName = "Chef Anton's Gumbo Mix", Category = "Condiments", UnitPrice = 21.3500, UnitsInStock = 0 });
    products.Add(new Product { ProductId = 6, ProductName = "Grandma's Boysenberry Spread", Category = "Condiments", UnitPrice = 25.0000, UnitsInStock = 120 });
    products.Add(new Product { ProductId = 8, ProductName = "Northwoods Cranberry Sauce", Category = "Condiments", UnitPrice = 40.0000, UnitsInStock = 6 });
    products.Add(new Product { ProductId = 15, ProductName = "Genen Shouyu", Category = "Condiments", UnitPrice = 15.5000, UnitsInStock = 39 });
    products.Add(new Product { ProductId = 9, ProductName = "Mishi Kobe Niku", Category = "MeatPoultry", UnitPrice = 97.0000, UnitsInStock = 29 });
    products.Add(new Product { ProductId = 17, ProductName = "Alice Mutton", Category = "MeatPoultry", UnitPrice = 39.0000, UnitsInStock = 0 });
    products.Add(new Product { ProductId = 29, ProductName = "Thüringer Rostbratwurst", Category = "MeatPoultry", UnitPrice = 123.7900, UnitsInStock = 0 });
    products.Add(new Product { ProductId = 53, ProductName = "Perth Pasties", Category = "MeatPoultry", UnitPrice = 32.8000, UnitsInStock = 0 });
    products.Add(new Product { ProductId = 54, ProductName = "Tourtière", Category = "MeatPoultry", UnitPrice = 7.4500, UnitsInStock = 21 });
    products.Add(new Product { ProductId = 11, ProductName = "Queso Cabrales", Category = "DairyProducts", UnitPrice = 21.0000, UnitsInStock = 22 });
    products.Add(new Product { ProductId = 12, ProductName = "Queso Manchego La Pastora", Category = "DairyProducts", UnitPrice = 38.0000, UnitsInStock = 86 });
    products.Add(new Product { ProductId = 31, ProductName = "Gorgonzola Telino", Category = "DairyProducts", UnitPrice = 12.5000, UnitsInStock = 0 });
    products.Add(new Product { ProductId = 32, ProductName = "Mascarpone Fabioli", Category = "DairyProducts", UnitPrice = 32.0000, UnitsInStock = 9 });
    products.Add(new Product { ProductId = 1, ProductName = "Chai", Category = "Beverages", UnitPrice = 18.0000, UnitsInStock = 39 });
    products.Add(new Product { ProductId = 2, ProductName = "Chang", Category = "Beverages", UnitPrice = 19.0000, UnitsInStock = 17 });
    products.Add(new Product { ProductId = 24, ProductName = "Guaraná Fantástica", Category = "Beverages", UnitPrice = 4.5000, UnitsInStock = 20 });
    products.Add(new Product { ProductId = 34, ProductName = "Sasquatch Ale", Category = "Beverages", UnitPrice = 14.0000, UnitsInStock = 111 });
    products.Add(new Product { ProductId = 35, ProductName = "Steeleye Stout", Category = "Beverages", UnitPrice = 18.0000, UnitsInStock = 20 });
    return products;
}

LINQ group by and GroupBy

I initially starting using LINQ as it was easy to order the objects in a list without having to write a Comparer. Just write your lambda expression and BOOM!, list sorted.

I want to take this thought a step further, and as implied by the post title, do a group by.

Starting, here is an order by % 2 giving us a list of even and then odd numbers:

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

var orderedNumbers = from n in numbers
                     orderby n % 2 == 0 descending
                     select n;

foreach (var g in orderedNumbers)
{
    Console.Write("{0},", g);
}

This is all pretty straight forward, order by numbers that when modded by 2 are 0 and we have the numbers 4,8,6,2,0,5,1,3,9,7.

But what if I want to simply have two lists, one with evens and one with odds? That’s where group by comes in.

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

var numberGroups = from n in numbers
                   group n by n % 2 into g
                   select new { Remainder = g.Key, Numbers = g };

foreach (var g in numberGroups)
{
    if(g.Remainder.Equals(0))
        Console.WriteLine("Even Numbers:", g.Remainder);
    else
        Console.WriteLine("Odd Numbers:", g.Remainder);
    foreach (var n in g.Numbers)
    {
        Console.WriteLine(n);
    }
}

with the output:

Odd Numbers:
5
1
3
9
7
Even Numbers:
4
8
6
2
0

What’s happening here is that LINQ is using anonymous types to create new dictionary (actually a System.Linq.Enumerable.WhereSelectEnumerableIterator<System.Linq.IGrouping<int, int>>).

It is important to note here that the key here that everything is keyed on is the first value after the “by”.

Taking this one simple step forward let’s group a bunch of words. The following doesn’t work quite right:

string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

var wordGroups = from w in words
                 group w by w[0] into g
                 select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };

foreach (var g in wordGroups)
{
    Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
    foreach (var w in g.Words)
    {
        Console.WriteLine(w);
    }
}

giving us the output:

Words that start with the letter 'b':
blueberry
Words that start with the letter 'c':
Chimpanzee
Words that start with the letter 'a':
abacus
apple
Words that start with the letter 'b':
Banana
Words that start with the letter 'c':
cheese

That’s because there is a bit of a red herring here. Remember that the first value after the by is what is used to group by. In our case w[0] for Chimpanzee is “C”, not c. If we change it to:

string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

var wordGroups = from w in words
                 group w by w[0].ToString().ToLower() into g
                 select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };

foreach (var g in wordGroups)
{
    Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
    foreach (var w in g.Words)
    {
        Console.WriteLine(w);
    }
}

then we get the results we expect with:

Words that start with the letter 'b':
blueberry
Banana
Words that start with the letter 'c':
Chimpanzee
cheese
Words that start with the letter 'a':
abacus
apple

Taking this even one step further we can throw an orderby above the group and order things alphabetically:

var wordGroups = from w in words
orderby w[0].ToString().ToLower()
group w by w[0].ToString().ToLower() into g
select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };

So let’s now make this a bit over the top complex. Given the classes:

public class Customer
{
    public List<Order> Orders { get; set; }
}

public class Order
{
    public DateTime Date { get; set; }
    public int Total { get; set; }
}

lets group a customer list by customer, then by year, then by month:

List<Customer> customers = GetCustomerList();
 
var customerOrderGroups = from c in customers
                          select
                              new {c.CompanyName,
                                   YearGroups = from o in c.Orders
                                                group o by o.OrderDate.Year into yg
                                                select
                                                    new {Year = yg.Key,
                                                         MonthGroups = from o in yg
                                                         group o by o.OrderDate.Month into mg
                                                         select new { Month = mg.Key, Orders = mg }
                                                    }
                                  };

Whew! that took a lot to copy and paste from MSDN’s sample library! 😉
As mentioned previously the important part here is that the keys for these are the first value after the “by”. This just creates a bunch of dictionarys keyed embeded together keyed on the values after the “by”.

The GroupBy method that is a part of Linq can also take an IEqualityComparer. Given the comparer:

public class AnagramEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return getCanonicalString(x) == getCanonicalString(y);
    }

    public int GetHashCode(string obj)
    {
        return getCanonicalString(obj).GetHashCode();
    }

    private string getCanonicalString(string word)
    {
        char[] wordChars = word.ToCharArray();
        Array.Sort<char>(wordChars);
        return new string(wordChars);
    }
}

we can find all the matching anagrams. This is possible because the IEqualityComparer compares words based on a sorted array of characters. If you take “meat” and “team” they both become “aemt” when sorted by their characters.

string[] anagrams = { "from", "salt", "earn", "last", "near", "form" };

var orderGroups = anagrams.GroupBy(
                      w => w.Trim(),
                      a => a.ToUpper(),
                      new AnagramEqualityComparer()
                  );

foreach (var group in orderGroups)
{
    Console.WriteLine("For the word "{0}" we found matches to:", group.Key);
    foreach (var word in group)
    {
        Console.WriteLine(word);
    }
}

Like the inline Linq, here the first value is the key and the second value is what to put into the list. The last value is the IEqualityComparer I mentioned earler. We don’t get double entries since “last” will match “salt” and there is no reason, therefore, to add a new key.

That’s all for now.

Brian