Get Value from Previous Row using Power Query

In this post you learn how to get the previous row value using Power Query. This is a useful skill to compare previous row value to the next. An example use case could be that your data shows cumulative numbers. Subtracting the previous cumulative number gets you the mutation between the two. Or you may want to find the previous date that was reported and calculate the difference between the two dates. This allows you to easily check if there missing records. In these cases and more you benefit from retrieving previous row values. So how do you go about that?

1. Requirements

When returning the previous row it is important to be clear on your requirements. What do you consider a previous row? Often this is dependent on a specific sort order. When you order your data by date or datetime and decide an earlier record is your previous row, this is a single-level hierarchy. The logic then depends on a single field. Or perhaps you want to take into account both a date and a category. In the latter case you end up applying logic to a multi-level hierarchy. Check out my follow-up post to learn how to handles these.

2. Methods in Single Level Hierarchy

2.1. Using Index Columns and Merge

In a single-level hierarchy retrieving the previous row is relatively easy. What you need is a way to reach the previous record. You can do that by using index columns. Imagine the below dataset:

Dataset with task and dates

Your goal is to find out how many days are between the current row and the previous row in the data. The first thing you should do is sort your data. Without sorting your data, you are not guaranteed the correct sort order. In this example, the date column determines the correct sort order. Click on the arrow in the Date column header and select Sort Ascending

To access the previous date you need a way to match the current row with the preceding row. You can do that by adding two index columns. An index column is a column that shows a sequence of consecutive numbers.

To create an index column, go to Add Column, select the dropdown arrow next to Index Column and click on From 0. This adds a number series starting from 0 and incrementing 1 with each row.

Picture showing how to add an index column

Add another index column this time starting From 1. This results in below dataset.

Dataset with two index columns included

With this in place you have all the requirements to retrieve the previous row value. You will use the Merge Queries functionality for this. Have a close look at row 2. Column Index 1 has a value of 1. The earlier row has an index of 1 in the column Index 2. The trick to get the earlier value is to Merge the query with itself, doing a Left Outer Join matching column Index 1 with column Index 2.

To do that, go to the Home tab, select Merge Queries. In the pop-up menu that appears you can select a table to merge with. In the dropdown select the current query. You are merging the data with itself. Select the column Index 1 in the upper table and Index 2 in the lower table. You can leave the join kind as Left Outer. Notice how 6 out of 7 rows are matched. This makes sense, because there is no previous row for the first line.

Merging columns pop-up screen in Power Query

Click OK and expand the newly created column. All columns that you expand will be the values from the previous row. In this case, select the Date column only and click okay.

Previous row data

The result is a column that contains the date of the earlier row. You can now delete the Index columns.

With this new value you can now calculate the difference between the date of the current and previous row. First select the column Date, hold ctrl and select the column Date.1. Then go to the Add Columns tab, select Date and press Subtract Days.

= Table.AddColumn(
     #"Expanded Added Index1", 
     "Subtraction", 
     each Duration.Days([Date] - [Date.1]), 
     Int64.Type
)
calculating a mutation between two dates

In a similar fashion you can get the value from the next row. Instead of Merging Index 1 with Index 2, you can merge Index 2 with Index 1. Merging these retrieves the next column value. And if you want to adjust this to retrieve the 5th value, you can adjust the Table.AddIndexColumn function to start at a different index.

Our example starts with an index at 0 with the code:

= Table.AddIndexColumn(#"Sorted Rows", "Index 1", 0, 1, Int64.Type)

An adjustment to argument 3 would make the Index column start at 5 and increment from there.

= Table.AddIndexColumn(#"Sorted Rows", "Index 1", 5, 1, Int64.Type)

2.2. Using Index Column and Reference

Another way to refer to the previous row value is by referencing cells directly. You can repeat the steps from chapter 2.1 up till adding the first index column. We start with below dataset.

Dataset including an index column

In Power Query you can reference a value with the following construct:

= StepName[ColumnName]{ZeroBasedIndex}

The zero based index indicates which row to retrieve, whereas the ColumnName shows which column to get the data from. To retrieve date value from the second row you can add to a custom column:

= #"Added Index"[Date]{1}

#”Added Index” is the previous step name, the column name is Date. Lastly you retrieve the second value by adding {1} using Power Query’s zero-based index. The newly added column now has the second date value in each row.

Result of a direct reference to a cell value in Power Query

This is the basic structure you can use. The next step is to make this dynamic. Instead of hardcoding the index number in the formula you can reference the Index 1 column. You can do that by changing the code to:

= #"Added Index"[Date]{ [Index 1] }

Since you want the value of the previous row you can still adjust this code to:

= #"Added Index"[Date]{ [Index 1] - 1 }

As you will see, the first line will now result in an error saying: “Expression.Error: The index cannot be negative.” That is because the formula tries to perform the following on the first line:

= #"Added Index"[Date]{-1}

For your final solution you will need to add some error checking to the formula:

= try #"Added Index"[Date]{-1} otherwise null

Try and otherwise can catch errors. When the try expression results in an error, the function returns the value after the otherwise clause.

Just like with the earlier example, you can also adjust this solution to retrieve the next row or the row that came 4 positions earlier. All you do is change the index number.

2.3. Attaching Previous Row Data

Another way to retrieve the previous row value involves some more complex M code. It is based on Imke’s solution. Instead of adding a separate index column, this solution sticks the previous row data next to the current data. Let’s start from here.

Dataset with tasks and dates

The first step is to retrieve the list of values we need using the List.RemoveLastN function. To learn how this function works you can find a range of examples in the complete guide to lists.

In our case we want the find the previous date. Since you want to compare the value to the previous row, the logic adds a null to the start and removes the last value. In the following code SortedRows is the previous stepname:

= {null} & 
List.RemoveLastN( Table.Column( SortedRows, "Date" ), 1 )
A list of shifted data starting with a null value

This creates a step that I named ShiftedDates. The only thing we still need to do, is stick the data in these 7 lines next to the data we already have. To achieve this you can perform the following steps.

First you transform the columns into a list, stick the ShiftedDates list to it, in a step called ListOfLists.

= Table.ToColumns( SortedRows ) & {ShiftedDates}

Then to get everything back together you can use:

= Table.FromColumns( 
     ListOfLists, 
     Table.ColumnNames( SortedRows ) & {"Previous Date"} 
  )

Notice how Previous Date is the column name for the previous row data.

The shifted data attached to the original data.

And that’s three ways on how to get the previous row value using Power Query. Note that this is all respecting a single level hierarchy. For multi-level hierarchies check out my follow-up post.

The examples I showed gave you an idea on how to get the earlier row. But with some slight adjustments you can also get the value from another row. You may want to reference values that come 5 records earlier or perhaps later in the data. All it takes is some slight adjustments.

I’m curious if you know any other ways to retrieve the previous row value. If you do, I would love to learn about them in the comments.

Enjoy Power Query!

About Rick de Groot

Rick is the founder and editor of BI Gorilla. He believes learning is one of the great pleasures in life and wants to share his knowledge to help you improve your skills.

Learn more about him here, connect with him on Twitter, Facebook and LinkedIn and subscribe to his YouTube Channel.

5 thoughts on “Get Value from Previous Row using Power Query”

  1. Rick, thank you for this post. I could not find your follow up post where you were going to give some guidance on how to do this with a multi level hierarchy.
    In my case I have a running list with monthly spend by customer by month and I need my new column to only reflect the change if the customer ID is the same.
    Sample “Wrong “Data in row 3

    Customer ID Current Month Previous Month Difference
    12345 $500 $500 $0
    12345 $500 $500 $0
    22222 $0 $500 -$500

    Desired Output (row 3 is correct)
    Customer ID Current Month Previous Month Difference
    12345 $500 $500 $0
    12345 $500 $500 $0
    22222 $0 $0 $0

    I’m trying to make sure that the “difference” does not take into account the previous month if it’s the 1st occurrence of the new customer ID.

    Thanks!

    Reply
    • Hi Fish,

      If you like you can have a look at this code, that should give you the right result. First copy this code into the advanced editor, and name this query ‘Dataset’. This query makes sure the currencies are in the columns instead of rows.

      let
      Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText(“i45WMjDUBSIjA0NLJR0l19AgIGmgZ64Uq4MmFRrsApaygEgZY9NliCkH1waUiwUA”, BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, CUR = _t, Rate = _t]),
      #”Changed Type” = Table.TransformColumnTypes(Source,{{“Date”, type date}, {“CUR”, type text}, {“Rate”, Int64.Type}}),
      #”Pivoted Column” = Table.Pivot(#”Changed Type”, List.Distinct(#”Changed Type”[CUR]), “CUR”, “Rate”, List.Sum)
      in
      #”Pivoted Column”

      After that, paste the below code in a separate query. It makes use of the ‘Dataset’ query, so the naming is important. This query creates a list of dates, starting at the minimum date of the Dataset query, and ending at the maximum date of the Dataset query.

      It then fills down the missing values so the dates that miss get the previous value of the currency. Lastly, it unpivots the values to return to the original form. Hope that helps!

      let
      Source = List.Dates( List.Min( Dataset[Date] ), Duration.Days( List.Max( Dataset[Date] ) – List.Min( Dataset[Date] ) ) + 1, #duration( 1, 0, 0, 0 ) ),
      #”Converted to Table” = Table.FromList(Source, Splitter.SplitByNothing(), { “Dates” }, null, ExtraValues.Error),
      #”Changed Type” = Table.TransformColumnTypes(#”Converted to Table”,{{“Dates”, type date}}),
      #”Merged Queries” = Table.NestedJoin(#”Changed Type”, {“Dates”}, Dataset, {“Date”}, “Base”, JoinKind.LeftOuter),
      #”Expanded Base” = Table.ExpandTableColumn(#”Merged Queries”, “Base”, {“Date”, “EUR”, “USD”}, {“Date”, “EUR”, “USD”}),
      #”Sorted Rows” = Table.Sort(#”Expanded Base”,{{“Dates”, Order.Ascending}}),
      #”Filled Down” = Table.FillDown(#”Sorted Rows”,{“EUR”, “USD”}),
      #”Removed Columns” = Table.RemoveColumns(#”Filled Down”,{“Date”}),
      #”Unpivoted Other Columns” = Table.UnpivotOtherColumns(#”Removed Columns”, {“Dates”}, “Attribute”, “Value”)
      in
      #”Unpivoted Other Columns”

      Reply
  2. Amazing! I’ve reached this for multiple task and exactly trying not to merge current table for performance issue.

    I’ve been trying to input the referencing of a previous row using the index column and reference method but stuck at the syntax for referencing each table.

    I did until (r) => “I think I need something here”{Index-1}[Rates] but unable to reference that nested table.

    Reply
    • Hi Ken,

      You can reference the column and position, but make sure to add the name of the previous step too.

      = #”Added Index”[Date]{ [Index 1] – 1 }

      Before example had “Added Index” as previous step name. Power Query needs that 🙂

      Rick

      Reply

Leave a comment