The Table.Group function in Power Query allows you to summarize your data based on specific columns. However, many users are unaware of the fifth argument that lets you customize how groups are formed. This article shows you how to use the Comparer Function in Table.Group. You’ll also learn how to create a custom comparer function for more precise data grouping.
Table of contents
Using Comparer Functions in Table.Group
The easiest way to use the Table.Group‘s Comparer argument is by using a comparer function from Power Query’s standard library. Here are the common ones:
- Comparer.Ordinal: This performs a case-sensitive comparison using ordinal rules.
- Comparer.OrdinalIgnoreCase: This ignores capitalization when comparing.
- Comparer.FromCulture: This respects local formatting rules during comparison.
We will work with the following dataset to demonstrate case sensitivity with the Table.Group function:
Default Case-Sensitive Grouping
The Table.Group function by default does a case-sensitive comparison using Comparer.Ordinal. Here’s an example of what this looks like:
Table.Group(
Source,
{"Location Id"},
{{"Row Count", each Table.RowCount(_), Int64.Type}},
GroupKind.Global,
Comparer.Ordinal
)
This code groups the data by the “Location Id” column in a case-sensitive manner, as illustrated below:
Due to the bad quality of the data, the names that look similar can’t be grouped together.
Tip
While you could use the Table.FuzzyGroup function to address this, we will focus on how to solve it with Table.Group.
Case-Insensitive Grouping
In case you want to group items together, regardless of their capitalization, you can add another Comparer Function to the fifth argument of Table.Group. The function you need is Comparer.OrdinalIgnoreCase. Here’s how to use it:
Table.Group(
Source,
{"Location Id"},
{{"Row Count", each Table.RowCount(_), Int64.Type},
{"Rows", each _, Table.Type }},
GroupKind.Global,
Comparer.OrdinalIgnoreCase
)
Notice the additional line that creates a “Rows” column. The outcome of this expression is:
The grouping now includes a table object, allowing you to inspect the summarized rows. This makes it easier to inspect the summarized data. By using this method, the operation groups names without considering capitalization, which results in fewer grouped rows.
Now the standard library comparer functions are handy when you only need to test for case sensitivity or ignore cases. But what if you need to handle more complex conditions? That’s where custom comparers come in.
Working with Custom Comparers
Table.Group supports the use of custom comparers. This lets you to define your own logic for grouping values. Before we dive into custom comparers, let’s understand the basic requirements for a valid comparer.
In the M language, there are several functions for comparing values. These functions determine if one value comes before, after, or is identical to another value. They return -1 if the first value comes before the second, 0 if they are equal, and 1 if the first value comes after the second.
In other words, the comparison logic must be symmetric. If value1 is determined to come after value2, then when the values are switched, value1 should come before value2. This symmetry makes sure your grouping is consistent and accurate.
Let’s look at a few examples to understand how comparisons work:
// For case insensitive comparisons
Comparer.OrdinalIgnoreCase( "Ape", "Bear" ) // Output -1
Comparer.OrdinalIgnoreCase( "Duck", "duck" ) // Output 0
Comparer.OrdinalIgnoreCase( "Bear", "Ape" ) // Output 1
// For case sensitive comparisons
Comparer.Ordinal( "Duck", "duck" ) // Output -1
Comparer.Ordinal( "duck", "duck" ) // Output 0
Comparer.Ordinal( "duck", "Duck" ) // Output 1
// Comparing Values
Value.Compare( 60, 10 ) // Output 1
Value.Compare( 60, 60 ) // Output 0
Value.Compare( 10, 60 ) // Output -1
This illustrates that a valid comparer function returns -1, 0, or 1. Now, how can we incorporate this into the grouping operation?
Single Column Logic
Case-Insensitive Comparison
Remember our previous example with Comparer.OrdinalIgnoreCase for case-insensitive comparisons? You can create a similar logic by building your own comparer function.
Here’s one way to do it:
Table.Group (
Source,
{ "Name" },
{ { "Row Count", each Table.RowCount ( _ ), Int64.Type },
{ "Rows", each _, Table.Type } },
GroupKind.Global,
( x, y ) => Value.Compare ( Text.Lower ( x[Name] ), Text.Lower ( y[Name] ) )
)
This code uses the Value.Compare function to compare values. It converts them to lowercase with Text.Lower first. This way, names are grouped together no matter how they are capitalized.
Group Data by Start of Quarter
LLet’s look at another example where we build custom comparers based on a single column and combine it with GroupKind.Global.
Imagine we have a dataset with the start dates of each month from January 1, 2000, to March 2001:
Our goal is to group these dates by each quarter of the year. This means we want to group January, February, and March for Q1 2000, April, May, and June for Q2 2000, and so on. To achieve this using a custom comparer function, we will use the Value.Compare function as follows:
Table.Group(
myTable,
{"Date"},
{{"Details", each _, type table [Date = date]},
{"NumOfRows", each Table.RowCount(_)}},
GroupKind.Global,
(x, y) =>
Value.Compare( Date.StartOfQuarter( x[Date] ), Date.StartOfQuarter( y[Date] ) )
)
This expression uses GroupKind.Global. This settings groups all values in the dataset without considering their order. Since the sorting order isn’t guaranteed, the custom comparer function should compare both the current value (x
) and the next value (y
) to ensure symmetry. By comparing the Date.StartOfQuarter value for all dates in the table and then grouping the data, you get the following result:
In this result, the data is grouped by the “Date” column, showing the first start of the quarter value within each group. Each group contains three values representing the months of that specific quarter of the year. Since the order of evaluation doesn’t matter with this logic, the result is accurate.
Multiple Columns Logic
In another scenario, we have a dataset with two text columns. For the summarizing of the columns we have the following requirements:
- Period: Group this column in a case-sensitive manner.
- Product: Ignore capitalization when grouping this column.
Here’s what the dataset looks like:
It’s important to remember that our custom comparer function should be symmetric. This means that the comparison should work correctly regardless of the order of the values. To achieve this, we can use multiple comparer functions as follows:
Table.Group (
Source,
{ "Period", "Product" },
{ { "Details", each _, type table },
{ "NumOfRows", each Table.RowCount ( _ ) } },
GroupKind.Global,
( x, y ) =>
let
IgnCase = Comparer.OrdinalIgnoreCase ( x[Product], y[Product] ),
RespCase = Comparer.Ordinal ( x[Period], y[Period] ),
Result = if IgnCase = 0 then RespCase else IgnCase
in
Result
)
The result is the following table:
The custom comparer in the fifth argument performs two main operations to group values effectively.
- Step 1: Ignoring Case Differences: The first operation, stored in the
IgnCase
variable, checks for similarities in the Product column while ignoring capitalization differences. This means entries like “Shirt” and “SHIRT” are considered equal. - Step 2: Case-Sensitive Comparison: If the values are similar, the comparer then uses the
RespCase
variable to compare the ‘Period’ column in a case-sensitive way, using Comparer.Ordinal. If this comparison also finds the values equal, they are grouped together.
You’ve now seen a few examples that use custom comparer functions. These custom comparers need strict symmetry in their logic. However, sometimes you might use of GroupKind.Local instead of GroupKind.Global, which respects the order of your data. In those cases, you can use a simpler custom comparer setup. Let’s find out how.
Simplified Custom Comparers with GroupKind.Local
When you want your grouping operation to respect the order of your data, you can use the GroupKind.Local enumeration. Let’s explore how our logic changes in this scenario.
Suppose you’re working with a table where some data is missing. The input comes from a pivot table and when importing it into Power Query, the missing rows are filled with null values:
We want Power Query to create groups based on the Location Id
column.
Tip
You can also fill down the values in the Location Id
column using the Table.FillDown function. After that, you can perform a simple Group By operation.
To make the Table.Group function start a new group each time there is a number value in the “Location Id” column, you can use a custom comparer function. Here’s how:
Table.Group(
Source,
{"Location Id"},
{{"Details", each _, type table },
{"Names", each Text.Combine(_[Name], ", ") }},
GroupKind.Local,
(x, y) => Number.From( y[Location Id] is number )
)
In the custom comparer function’s arguments, x
is the current value, and y
is the next value to compare. Since we are using GroupKind.Local, the function respects the order of the values. In this case, it’s enough to build a custom comparer that only tests y
, the next value.
So what is happening in the above code? For each value the function encounters, the function tests if the expression y[Location Id] is number
returns true. Each time this test evaluates to true, meaning y
is a number, the function starts a new group.
Important
Whenever you define a custom comparer function, the outcome should be -1, 0 or 1. In the above example, we return 0 or 1 by applying the Number.From function on the outcomes of a condition, which are either true
or false
.
The outcome of this is:
The custom comparer function made sure that even if there are null
values in the “Location Id” column, they will be grouped with the preceding non-null “Location Id”.
However, this isn’t the only way. Instead of checking whether the data type is a number, you can also test whether the value itself is not null
. This starts a new group whenever the value is not null
:
Table.Group(
Source,
{"Location Id"},
{{"Details", each _, type table },
{"Names", each Text.Combine(_[Name], ", ") }},
GroupKind.Local,
(x, y) => Number.From( y[Location Id] <> null )
)
The outcome of this query is identical to the one above.
Important
The takeaway from this is that when you’re working with GroupKind.Local, you don’t have to define a custom function that is symmetric. It is enough to define when a new group should start by evaluating y
. However, here it is also valid to provide a symmetric comparer function.
Download
To play with this code yourself, you can download the file below:
Conclusion
As this article showed, the fifth argument of Table.Group provides additional flexibility. You can group your values without creating any additional columns. However, it’s important to note that any custom comparer function you create will not fold. Therefore, if performance is of importance, consider adding separate columns and group your data by these instead. Your operations may actually fold.
Recommended Reading…
Power Query Foundations
Beginners Guide to Power Query
List Functions (200+ Examples)
Text Functions (150+ Examples)
Creating Tables (40+ Examples)
Generate the Ultimate Date Table
Advanced Topics
Master List.Generate with Easy Examples
Deep Dive into List.Accumulate
Great article Rick!
I guess there is a typo in the 2nd Comparer.Ordinal :
// For case sensitive comparisons
Comparer.Ordinal( “Duck”, “duck” ) // Output -1
Comparer.Ordinal( “Duck”, “duck” ) // Output 0
Comparer.Ordinal( “duck”, “Duck” ) // Output 1
¿Should it be “Duck” instead of “duck”?
Giovanni – Thanks for mentioning it. That was a typo. It’s fixed now 🙂
Congrats !! excelent optionals functions for Group by.. though it could be leading to weird results when grouping by two columns cause in the same record it appears (inside the output table) fields that belongs to another one: ie:
Record with DAY,SHIRT in columns : the output in the table provide Day, shirt and pants which may lead to a wrong conclusion.
Cols cols Table
DAY | SHIRT | DAY | PANTS
thanks Again
Ginés
Hi GinesC,
Thanks for your comment. I understand your concern, but I couldn’t find any instance where the grouped columns in the file I shared have different values from the table objects. If the group columns had ‘Day’ and ‘Shirt’, the table object shouldn’t contain ‘Day’ and ‘Pants’, and I agree with that. The only difference in the file is the capitalization of the values, which is exactly what I was trying to show.
Best regards, Rick
Great Post! I shall rethink back my logic whenever I use this Table.Group functions. Thanks Rick!
That’s great Tan. Just remember, it breaks query folding. So if you work with a database, you may want to simply add separate columns. Enjoy!