Written By Rick de Groot

Rick is the founder of BI Gorilla. He believes learning is one of life's greatest pleasures and shares his knowledge to help you improve your skills.

In this post, you learn how to group your data and concatenate text values. With the right function, you can concatenate strings into a single cell using the group by functionality. The effect is similar to using CONCATENATEX in DAX. So how can you use group by to concatenate text values in Power Query?

Most users know the Group By functionality in Power Query. It allows you to summarize data and aggregate the grouped values. The standard aggregations are Average, Median, Min, Max, Count Rows, Count Distinct Rows and All Rows. Most of these are aimed at numbers. Yet Power Query can also perform operations relevant to text.

Table of contents
Concatenate Text Values using Group by in Power Query (w/ Optional Delimiter)

1. The dataset

Imagine the data set below. You summarize the rows so that only a single row is available for each product. On top of that, you sum the grouped sales values. Both operations can be done using the user interface.

The challenge is to concatenate the grouped text values of the sellers into a single column. You will write some custom code to solve this. Let’s get going!

Data set for grouping

To summarize the data Product and then Sum the underlying sales, you can:

  • Go to the Home Tab -> Click Group By
Using the Group By functionality

In the Pop-up screen:

  • Click Advanced
  • Group By -> Product
  • In the aggregation section, add a new column name called ‘Sales’, set the operation to Sum and select Column ‘Sales’.

So far, everything can be done by the user interface. The challenge left is to create one that combines all seller names in a comma-separated list. The user interface does not provide the right code.

Yet it does provide a template that shows how to use an aggregation function within the Table.Group function. Let’s go ahead and add a ‘dummy’ aggregation that prepares the code for the next step. What you can do is:

  • Click on Add aggregation, name the column Sellers, use the Sum Operation on the Seller column. Click OK.
Group By menu to specify aggregations

The operation groups the data by Product, and sums the underlying sales. Notice that the Seller column contains an error. This error occurs since we told Power Query to perform a Sum operation on text values.

Clearly, the Sum operation only works on numbers and won’t concatenate text values.

Error message when summing text values

Yet, for our purposes, this setup is great. We can now correct the wrong formula without having to write everything from scratch.

2. How does Table.Group work?

Before we can change the code, it is good to have a look at how Power Query groups data. When grouping data, Power Query uses the Table.Group formula.

The Table.Group formula takes three arguments. It needs:

  1. a table. In this case, the step #”Changed Type” refers to a table.
  2. the key columns. These columns specify the columns to group the data by. In this case, the column named “Product” is the key to group the data.
  3. aggregated columns as list. The formulas that are used to aggregate the data should be able to handle a list. The underlying data is always provided as a list for aggregations. In the example, the formula List.Sum is used to sum the sales numbers.

Focus on step 3 for a moment, as this argument provides a crucial hint for what formulas you can use to summarize text values. The aggregation formula needs to be able to handle a list as a variable.

When searching the Power Query formula reference, you may stumble upon the Text.Combine function. The Text.Combine function combines text and takes a list as input. The formula then returns a text value that is the result of joining all underlying text values. And each value that is concatenated, can be separated using a separator.

As input arguments, the formula requires text to concatenate (as a list), and it needs to know with what separator to use. This is exactly what we need for our challenge.

3. Concatenate Text Values

3.1. Table.Group with Text.Combine

You now know which function to use to concatenate text values in Power Query. It is time to adjust the M-code generated by the user interface. We can try to incorporate the Text.Combine function in the formula containing the error. The part of the formula returning an error is:

{"Seller", each List.Sum( [Seller] ), type text}

To have it use the Text.Combine function you can change it to:

{"Seller", each Text.Combine( [Seller] , ", "), type text}

The first argument indicates which column (received as a list) provides the text to combine. In this case, the column ‘[Seller]’.

The second argument tells us what separator to add between the text. The example contains a comma and a space (“, “). After pressing Enter, the screenshot below is the result.

text.combine function to aggregate text values

The names of the Sellers are now all within the same cell and don’t return an error anymore. Goal reached! You have successfully combined the group by operation with the Text.Combine function.

3.2. All Rows and Extract Values

Another way to use Group By to concatenate rows is by using the All Rows functionality. Imagine we again group the data by Product. You take the sum of the Sales column, but this time, you also select the All Rows operation.

1. All Rows operation using Group By

The result of the All Rows operation is a Table object containing all grouped records, you can see it below in the column Details.

2. Table object from All Rows Operation

When working with tables, you can retrieve the value from a column by referencing it between square brackets. A custom column with the formula = [Sales] results in the corresponding value within that column.

3. Column reference in same step

Yet when you reference a column in a different table or an earlier step in your query, a column reference results in a list containing all values from that column.

Tip

To learn more about the syntax of lists and learn from examples, make sure to check out:
Lists in Power Query M / List Functions (200+ Examples).

4. Column reference in previous step

Knowing this, we can cook up a formula to reference all seller values in the table object of the Details column. To retrieve the Seller names, the formula first references the Details column that contains the table object.

And on top of that reference, you reference the Sellers column within the table object. You can do that by writing:

= [Details][Seller]
5. Reference column values from Table as list

On the bottom left, you can preview the grouped sellers found within the list item. The last step is to extract the values from this column. To do that:

  • In the Custom column, select the outward-facing arrows and select Extract Values.
  • Choose a Custom delimiter and write “, ” (without parenthesis), then press OK.
6. Extract list values to comma separated list

The resulting formula looks very similar to what we wrote earlier:

= Table.TransformColumns(
     #"Added Custom", 
     {"Custom", 
         each Text.Combine(List.Transform(_, Text.From), ", "), 
         type text })

The only difference is that this formula includes the List.Transform and Text.From function. This addition is relevant for value types that are different from text.

Tip

If list functions like these are new to you, make sure to check out this complete guide to lists in Power Query.

It took some slight adjustments of the M-Code, but nothing you can’t handle. You will get more comfortable with making manual changes the more you do it. To take a closer look, you can download the example file:

Conclusion

In this post, you have learned that when you group data, only aggregations are possible with functions that allow lists as input. Knowing this, you can leverage functions that take a list as argument, when grouping your data.

I hope this post opens your mind to more experimentation. If you have any questions or stumble upon useful formulas to leverage when grouping, I’d love to hear about them in the comments.

Enjoy Power Query!

Share on:
  1. I have a dataset in a row some begins with KR-1111-1111-111 and some begins with 123045, I want to remove duplicate only from the KR No how will i do that and which formula I will apply.

    Reply
  2. I just love these website, and using this trick a lot!

    And Today I faced another challenge: what is I want to do the sorting not by ascending/descend but by another column?

    Text.Combine( 
      List.Sort( 
        List.Distinct([columnB] ), 
        Order.Ascending[columnA] 
      ),
      "; "
    ), type nullable text} 

    Like listagg in SQL:
    ,listagg(distinct columnB, ‘ + ‘) within group (order by columnA)

    Is it possible in Power Query?
    Thank you!

    Reply
  3. Just a thank you for this and other articles you’ve published! Your website often provides me with a solution when looking for Power Query help.

    Reply
  4. Way cool! Some very valid points! I appreciate you writing
    this post and the rest of the website is really good.

    Reply
  5. Here’s a little snippet I have been using to get distinct values in a sorted order. Thank

    Text.Combine( 
      List.Sort( 
        List.Distinct([] ), 
        Order.Ascending 
      ),
      "; "
    ), type nullable text} 

    HTH

    * Thanks to whomever showed me the List.Distinct approach. I was having a heck of a time removing duplicate [port numbers, or App Codes, or IP Addresses] in my concatenation attempt.

    Reply
  6. Hi, I am using the group by function and then text.combine but I have run into a problem because I work with a big file and my result exceeds to maximum characters in a singel excel cell.
    Do you know of any simple way to split a grouped row into many rows if total characters exceeds max?

    Reply
    • That sounds very familiar. You can first concatenate your values. And then once the string is put together use something like:

      Table.SplitColumn( 
         Source, 
         "Add Article_HTML", 
         Splitter.SplitTextByRepeatedLengths( 32766 ), 
         { "col1", "col2", "col3" }
      )
      Reply
  7. Excellent. It always seems so intuitive once someone walks you through it.
    I’d been trying to group geographic coordinates into a single single string so I could convert it into WKT format for use with the IconMap Visual (a geofence polygon), then make a heatmap for number of deliveries per zone.

    It worked wonderfully and I’m feeling kinda goofy after trying to write lots of custom code, when the user interface was able to take me 90% of the way.

    Cheers, Rick!

    Reply
  8. Beautiful. Thank you. I used this to create a Reference table and show what product types were associated with an opportunity ID. I couldn’t use a lookup because of the Many to Many issue, and this solved that.

    Reply
  9. Hi Rick,

    I was wondering if this little trick could also work inside the Table.AggregateTableColumn
    function. In principle it worked, however I failed in adding the separator argument, since the aggregations seem to either work differently or pass the arguments in a slightly different way?
    Do you have any idea?

    This works:

    Table.AggregateTableColumn(
       #"Merged Queries", 
       "My_Lookup_Table", 
       {{"Packagecode", Text.Combine, "Combined Packagecodescodes"}})

    This doesn’t (also adding the ‘each’ statement doesn’t work):

    Table.AggregateTableColumn(
       #"Merged Queries", 
       "My_Lookup_Table", 
       {{"Packagecode", Text.Combine([Packagecode], ", "), "Combined Packagecodescodes"}})
    Reply
    • Hi Marc,

      That’s an interesting challenge. I didn’t try this before you asked. So your code is working like:

      Table.AggregateTableColumn(
         #"Merged Queries",
         "My_Lookup_Table",
         {{"Packagecode", Text.Combine, "Combined Packagecodescodes"}})

      To make it working with a separator you can do:

      Table.AggregateTableColumn(
         #"Merged Queries",
         "My_Lookup_Table",
         {{"Packagecode", each Text.Combine( _ ", " ), "Combined Packagecodescodes"}})

      You were on the right path using the each keyword. You can then reference the “_” (underscore), to indicate it’s for each value within that column, and then add the separator as argument.

      Hope that helps and thanks for asking your question.

      Best regards,
      Rick

      Reply
      • Hi Rick,

        that was the missing link! Thank you so much, it works as expected and saved me quite some workarounds… AWESOME! Thanks 🙂

        kind regards,
        Marc

        Reply
  10. Thanks! Now can you help doing the same with a date column? I know somewhere you need to do a Date.ToText but it’s not working.

    Reply
    • Hi H,

      Great question, that’s a little more difficult. What you can do is:

      {"Seller", 
      each Text.Combine( 
              List.Transform( _ , (DateList) => Date.ToText( DateList ) ) , ", " ), 
      type text } 

      The part with List.Transform is added, so that Power Query can first transform each Date that needs to be grouped. Only to then combine them together.

      Hope this helps!
      Rick

      Reply
      • When I try this code it gives me the error:

        Expression.Error: We cannot convert a value of type Table to type List.
        Details:
        Value=[Table]
        Type=[Type]
        I need to concatenate all of the dates as text separated by a comma. I have tried lots of approaches but I always get the same result. I might need to look more at the meaning of _ and how to convert the date column but do you have suggestions?

        Reply
        • Thanks for your help. I took you code exactly as written and I should have added the [Date] after the _ … I did this and it works great!

          Reply
  11. Great! This helped me a lot.!
    What about if you just want to group and take the values for the latest one. Let me explain:
    In the example, let say you just want to now the values where the date is the oldest. Then you should have:
    19/04/2018 | Face Cream | Jaime | 90.07
    22/04/2018 | Hand Soap | Lisa | 74.05
    ..etc

    Reply
  12. Thanks Rick, this helpled me a lot! A quick question: is there a quick way to remove potential duplicate values in the newly created column? Many thanks!

    Reply
    • Hi Anne-Sophie,

      That’s a great question. This is the answer I gave to someone else in the comments:

      To get a list of UNIQUE seller names, you can add List.Distinct to the formula.

      The original formula for grouping in my example file is:

      = Table.Group( #"Changed Type", 
          {"Product"}, 
          {{"Sales", each List.Sum([Sales]), type number},
           {"Sellers", each Text.Combine([Seller], " , "), 
           type text} } )

      You can change this to the following to get a unique list of names

      = Table.Group( #"Changed Type", 
          {"Product"}, 
          {{"Sales", each List.Sum([Sales]), type number},
           {"Sellers", each Text.Combine( List.Distinct( [Seller] ), " , "), 
           type text} } )

      Hope that helps!

      Reply
  13. Can i accomplish this same csv text column result using #”Aggregate . . . ” given it in a left join scenario to produce the one left side hit for many right side hit results using #”Group By . . . ” requires defining every left side table column value you want included in the output results?

    Reply
    • Hi,

      Could you send me an example file in Excel or Power BI and describe the desired result? It’s not completely clear to me yet.

      With regards, Rick

      Reply
  14. This is a great article, thank you so much for posting it! One other question, how do you update the new tables if new information is added to the original worksheet Source. I tried clicking on the ‘Refresh All’ in the Data Tab, but it didn’t add the new row of information from the Source Worksheet. Thank you!

    Reply
    • Hi Stephanie,

      A ‘Refresh All’ should normally do the trick. In what form is your source data? Is it in a table and does Power Query pick up the table? If it’s not in a table, it could be that Power Query picks up the named range. Then if this named range is not including the additional data, it won’t be picked up.

      Best regards,
      Rick

      Reply
    • Hi Rachel.

      The grouping is required. But what you could do to keep all your columns, is to specify an additional column in the grouping screen. And in this extra column, select the operation : All Rows. After grouping, you will then see an additional column containing all the columns that were grouped earlier.

      Reply
  15. Hi
    Just wanted to say this is the first article I read on this website and I can tell it has some really good content with great examples.
    On a side note: just bookmarked this site right next to Stackoverflow.

    Reply
  16. Nice tip!!

    but how to make it a unique SELLER NAMES for eg!

    Sun Lotion 2951.06 Richard,Lisa,Dennis,Mark,Mark,James,James,James

    Req Result:

    Sun Lotion 2951.06 Richard,Lisa,Dennis,Mark,James

    Reply
    • Hi Muhammed. That’s a great question! To get a list of UNIQUE seller names, you can add List.Distinct to the formula.

      The original formula for grouping in my example file is:

       = Table.Group( #"Changed Type",
         {"Product"},
         {{"Sales", each List.Sum([Sales]), type number},
         {"Sellers", each Text.Combine([Seller], " , "),
         type text}}) 

      You can change this to the following to get a unique list of names

       = Table.Group( #"Changed Type",
         {"Product"},
         {{"Sales", each List.Sum([Sales]), type number},
         {"Sellers", each Text.Combine(List.Distinct([Seller]), " , "),
         type text}}) 
      Reply
  17. Hi Jim. There was no example file before. To help you out, I added an example file just before the last paragraph now. Hope it helps you!

    Reply

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.