Data comes in countless ways and forms. Sometimes structured, with an SQL database, a table in Excel or several CVS files. The fun with Power Query starts when the delivered format is quite messy.
It works to your advantage to be able to handle different scenarios. This post focuses on how to transform data from a single column, but that has a repetitive pattern, into a table. Spoiler, we will use a Modulo column.
Getting Started
Below example shows a repetitive pattern with 5 unique categories. After that data for the first category repeats.
To get started:
- Select the Starting Data -> go to tab Data -> click from Table/Range
With the data in Power Query, you now need a way to get the categories in the columns. Unfortunately, transposing the data does not help. It creates many columns, but still doesn’t group the right data.
Another approach is using an Index column. If there is a pattern in the data, it becomes more visible after you add an Index column.
- Go to Add Column -> click the arrow next to Index Column -> select From 0
With the Index column provided, let’s look at what pattern appear. It becomes clear that the first 5 rows are the shown categories (note that the Index Column starts at 0, so number 4 is actually the 5th value).
Modulo Column
If the Index Column would have numbers from 0 to 4 and start over again, that would provide you with a categorization that you can work with. And that’s exactly what Power Query does when creating a Modulo Column based on the Index column.
A modulo column takes the values of an existing column, divides them by a specified amount and returns the remainder. To do this:
- Select column Index
- Go to tab Add column -> Standard (operation) -> click Modulo
- Fill in the number 5
The added column now shows a series starting from 0 and going to 4, before repeating the pattern again.
Pivot Data
With this column, you can now pivot the data. In that way it turns into a table, which is much easier to work with. Do pay attention. When pivoting the Inserted Modulo column, the standard operation Power Query performs is to add up to numbers, or else count text values.
In this case, it’s better not to aggregate any data. To do that:
- select column Inserted Modulo -> click the tab Transform -> click Pivot Column
- as Values Column specify Data
- before clicking OK go to Advanced options -> select Don’t Aggregate
For your understanding, it can be good to test what happens if you forget to select Don’t Aggregate. That would present the data in a useless way for this exercise. Yet, it’s good to be aware this is the standard operation.
The result of pivoting the data is that it’s laid out in a tabular form. The biggest issue left is that the data relating to a single row, is scattered over five different rows. In an ideal world there would only be three rows with data.
Cleaning up
To get there, you start out with the Fill Up functionality.
- Select column 1 -> hold down shift -> left-click on column 4
- Go to the tab Transform -> select Fill -> press Up
We’re almost there. To keep the three needed rows, you need one more step:
- Filter out null values from column 0, containing the dates.
The table is now left with three rows. The only thing left is to clean up. Before exporting the data to the excel worksheet:
- Delete the column Index (as we don’t need it anymore)
Change data type from column:
- 0 to Date
- 1 to Whole Number
- 2 and 3 to Text
- 4 to Decimal Number.
And rename columns:
- 0 to Date
- 1 to Material Nr
- 2 to Category
- 3 to Name
- 4 to Sales
Result
Then load the query to the worksheet.
If you like to follow these steps with the Excel file, have a look here: Example File – Structure Data Using Index Modulo Column.
Conclusion
This post showed a way to handle data that comes in a single column, yet has a repetitive pattern. With only a few steps you turn a single column of data into a table.
You won’t need this every day, but the skill might prove handy some day. If you like this skill, make sure to also check out how you can Group your Data and reach into Reach into Underlying Rows right after.
That’s all for now. If this post added value, please share it with your network. And I would love to read comments with questions or suggestions.
Recommended Reading…
Power Query Foundations
Beginners Guide to Power Query
List Functions (200+ Examples)
Text Functions (150+ Examples)
Creating Tables (40+ Examples)
Generate the Ultimate Date Table
Advanced Topics
Master List.Generate with Easy Examples
Deep Dive into List.Accumulate
What would be your suggestion if a source table is super big (talking in GBs), you have already cleaned as much as possible and the pivoting just won’t work (getting error on missing memory). My stacked column just contained different value categories so my original thought was to pivot them. For now I have separated different values into separate tables but I am not sure if this is a good approach or would e.g. creating tables based on year or other category better?
Hi Rick,
I am leaving this comment to express how grateful I am for this post. I had been crawling the web for hours looking for the solution before stumbling upon this page. It is exactly what I was looking for.
Thanks!
What when data is not consistent like.
after sales there is add discount in next row but when 2nd sales came there will not.
any solution ?
Hi AHmed. I haven’t come up with a solution yet… Hope to get back to you later. Thanks for posting!
thank a lot , perfect job
Hey. Thanks for this post. It will really come handy. I think one step is missing. The step is to remove duplicates after you filter out the nulls.
He Omisile! Thanks for commenting. I tried replicating your point, but didn’t bump into any duplicates. The fill up operation was performed on the columns with number 1 to 4. It is column 0, with the dates, where we filter out the null’s. With the steps taken this column does not contain any duplicates. I get the desired results without a remove duplicates operation. Do feel free to include it in your query as it doesn’t hurt.