Get Value from Previous Row using Power Query

In this post you learn how to get the previous row value using Power Query. This is a useful skill to compare previous row value to the next. An example use case could be that your data shows cumulative numbers. Subtracting the previous cumulative number gets you the mutation between the two. Or you may want to find the previous date that was reported and calculate the difference between the two dates. This allows you to easily check if there missing records. In these cases and more you benefit from retrieving previous row values. So how do you go about that?

1. Requirements

When returning the previous row it is important to be clear on your requirements. What do you consider a previous row? Often this is dependent on a specific sort order. When you order your data by date or datetime and decide an earlier record is your previous row, this is a single-level hierarchy. The logic then depends on a single field. Or perhaps you want to take into account both a date and a category. In the latter case you end up applying logic to a multi-level hierarchy. Check out my follow-up post to learn how to handles these.

2. Methods in Single Level Hierarchy

2.1. Using Index Columns and Merge

In a single-level hierarchy retrieving the previous row is relatively easy. What you need is a way to reach the previous record. You can do that by using index columns. Imagine the below dataset:

Dataset with task and dates

Your goal is to find out how many days are between the current row and the previous row in the data. The first thing you should do is sort your data. Without sorting your data, you are not guaranteed the correct sort order. In this example, the date column determines the correct sort order. Click on the arrow in the Date column header and select Sort Ascending

To access the previous date you need a way to match the current row with the preceding row. You can do that by adding two index columns. An index column is a column that shows a sequence of consecutive numbers.

To create an index column, go to Add Column, select the dropdown arrow next to Index Column and click on From 0. This adds a number series starting from 0 and incrementing 1 with each row.

Picture showing how to add an index column

Add another index column this time starting From 1. This results in below dataset.

Dataset with two index columns included

With this in place you have all the requirements to retrieve the previous row value. You will use the Merge Queries functionality for this. Have a close look at row 2. Column Index 1 has a value of 1. The earlier row has an index of 1 in the column Index 2. The trick to get the earlier value is to Merge the query with itself, doing a Left Outer Join matching column Index 1 with column Index 2.

To do that, go to the Home tab, select Merge Queries. In the pop-up menu that appears you can select a table to merge with. In the dropdown select the current query. You are merging the data with itself. Select the column Index 1 in the upper table and Index 2 in the lower table. You can leave the join kind as Left Outer. Notice how 6 out of 7 rows are matched. This makes sense, because there is no previous row for the first line.

Merging columns pop-up screen in Power Query to retrieve previous row

Click OK and expand the newly created column. All columns that you expand will be the values from the previous row. In this case, select the Date column only and click okay.

Previous row data

The result is a column that contains the date of the earlier row. You can now delete the Index columns.

With this new value you can now calculate the difference between the date of the current and previous row. First select the column Date, hold ctrl and select the column Date.1. Then go to the Add Columns tab, select Date and press Subtract Days.

= Table.AddColumn(
     #"Expanded Added Index1", 
     "Subtraction", 
     each Duration.Days([Date] - [Date.1]), 
     Int64.Type
)
calculating a mutation between two dates

In a similar fashion you can get the value from the next row. Instead of Merging Index 1 with Index 2, you can merge Index 2 with Index 1. Merging these retrieves the next column value. And if you want to adjust this to retrieve the 5th value, you can adjust the Table.AddIndexColumn function to start at a different index.

Our example starts with an index at 0 with the code:

= Table.AddIndexColumn(#"Sorted Rows", "Index 1", 0, 1, Int64.Type)

An adjustment to argument 3 would make the Index column start at 5 and increment from there.

= Table.AddIndexColumn(#"Sorted Rows", "Index 1", 5, 1, Int64.Type)

2.2. Using Index Column and Reference

Another way to refer to the previous row value is by referencing cells directly. You can repeat the steps from chapter 2.1 up till adding the first index column. We start with below dataset.

Dataset including an index column

In Power Query you can reference a value with the following construct:

= StepName[ColumnName]{ZeroBasedIndex}

The zero based index indicates which row to retrieve, whereas the ColumnName shows which column to get the data from. To retrieve date value from the second row you can add to a custom column:

= #"Added Index"[Date]{1}

#”Added Index” is the previous step name, the column name is Date. Lastly you retrieve the second value by adding {1} using Power Query’s zero-based index. The newly added column now has the second date value in each row.

Result of a direct reference to a cell value in Power Query

This is the basic structure you can use. The next step is to make this dynamic. Instead of hardcoding the index number in the formula you can reference the Index 1 column. You can do that by changing the code to:

= #"Added Index"[Date]{ [Index 1] }

Since you want the value of the previous row you can still adjust this code to:

= #"Added Index"[Date]{ [Index 1] - 1 }

As you will see, the first line will now result in an error saying: “Expression.Error: The index cannot be negative.” That is because the formula tries to perform the following on the first line:

= #"Added Index"[Date]{-1}

For your final solution you will need to add some error checking to the formula:

= try #"Added Index"[Date]{-1} otherwise null

Try and otherwise can catch errors. When the try expression results in an error, the function returns the value after the otherwise clause.

Just like with the earlier example, you can also adjust this solution to retrieve the next row or the row that came 4 positions earlier. All you do is change the index number.

2.3. Attaching Previous Row Data

Another way to retrieve the previous row value involves some more complex M code. It is based on Imke’s solution. Instead of adding a separate index column, this solution sticks the previous row data next to the current data. Let’s start from here.

Dataset with tasks and dates

The first step is to retrieve the list of values we need using the List.RemoveLastN function. To learn how this function works you can find a range of examples in the complete guide to lists.

In our case we want the find the previous date. Since you want to compare the value to the previous row, the logic adds a null to the start and removes the last value. In the following code SortedRows is the previous stepname:

= {null} & 
List.RemoveLastN( Table.Column( SortedRows, "Date" ), 1 )
A list of shifted data starting with a null value

This creates a step that I named ShiftedDates. The only thing we still need to do, is stick the data in these 7 lines next to the data we already have. To achieve this you can perform the following steps.

First you transform the columns into a list, stick the ShiftedDates list to it, in a step called ListOfLists.

= Table.ToColumns( SortedRows ) & {ShiftedDates}

Then to get everything back together you can use:

= Table.FromColumns( 
     ListOfLists, 
     Table.ColumnNames( SortedRows ) & {"Previous Date"} 
  )

Notice how Previous Date is the column name for the previous row data.

The shifted data attached to the original data.

And that’s three ways on how to get the previous row value using Power Query. Note that this is all respecting a single level hierarchy. For multi-level hierarchies check out my follow-up post.

The examples I showed gave you an idea on how to get the earlier row. But with some slight adjustments you can also get the value from another row. You may want to reference values that come 5 records earlier or perhaps later in the data. All it takes is some slight adjustments.

I’m curious if you know any other ways to retrieve the previous row value. If you do, I would love to learn about them in the comments.

Enjoy Power Query!

Recommend Reading

>>> Power Query - Foundations <<<
 List Functions (200+ examples)
 Text functions (150+ examples)
Creating Tables from Scratch (40+ examples)
 Understanding If Statements 
Generating a Date Table


>>>  Power Query - Advanced Topics <<<
Master List.Generate with Easy Examples
 Deep Dive into List.Accumulate
Create Calendar with Dynamic Language
Replacing Values (User Interface and Beyond)
Share on:
About Rick de Groot

Rick is the founder and editor of BI Gorilla. He believes learning is one of the great pleasures in life and wants to share his knowledge to help you improve your skills.

Learn more about him here, connect with him on Twitter, Facebook and LinkedIn and subscribe to his YouTube Channel.

12 thoughts on “Get Value from Previous Row using Power Query”

  1. Hello Rick, Thank you for this awesome video. I have a challenge that I would love you to help me with. I feel that this video is helping but I need to take it to the next level 🙂
    I have items that are being ordered monthly. Before injecting the items into my system, I need to make sure that the monthly order respect a certain minimum quantity (MOQ).

    Lets say item A is wanted in 01/01/2023 – 80000 pcs in 01/02/2023 – 40000 pcs and in 01/03/2023 – 80000 pcs. The problem is that the MOQ of this item is 150000 pcs.
    So since the monthly orders does not respect the minimum qty, I add up February and January so it is 120000 pcs. It is still under the MOQ. So I add up to that the march need. So it is 200000 pcs that I will have to inject into my system in 01.01.2023.
    The tricky part is that I should tell Query that if the MOQ is respected, then go to the next line, but if it is not, the add up the quantity below but leave it in the first required date.
    And adding up a little spices to that 😊, same exemple but with a 4th PO wanted in 01.04.2022 for 150000 pcs. Query should be able to return that the need for January is 200 K AND the next need is for April for 150K. (Since the MOQ is respected, there is no need to add up this need with the others).
    I hope I am clear. I did not find a solution for this request. Will really appreciate your help on this.

    Thanks a lot.

    Reply
  2. Hi Rick,
    This solved my problem of finding duplicate Employee Names perfectly, however after adding this 1 line of code the table jumps from ~8mb to >12GB! I stopped it at 12 but it was still loading. The table has 63,000 rows.
    = Table.AddColumn(#”Added Index”, “DuplicateEmpName”, each try #”Added Index”[EmpName]{[Index] -1} otherwise null).
    What have I broken?

    Reply
  3. Hi Rick,
    Thanks for this post, was looking for something similar (the multi-level hierarchy, maybe you do get some time to come up with the follow-up article).
    My comment/question actually tackles the multi-level hierarchy, specifically:
    – what would your approach be if you are trying to get the value from the previous row of the same category, which could or could not be the strictly previous row?
    For example imagine following table (a column of unique id’s (could be dates/datetime/indexes), a column of Category and a column of Values):
    uniqueID Categ1 some_value1 (index1=0, index2=1)
    uniqueID Categ1 some_value2 (index1=1, index2=2)
    uniqueID Categ2 some_value1 (index1=2, index2=3)
    uniqueID Categ1 some_value3 (index1=3, index2=4)
    uniqueID Categ2 some_value2 (index1=4, index2=5)
    Purpose is to get difference vs previous value, by category.
    When implementing the “merge” solution, when joining by the temporary indexes, the result will only be useful for row 2, because previous row is from the same category.
    Any idea?
    (tried doing something like this in DAX, but i ran out of memory so I was thinking PQ would be faster and less consuming 😀 )

    Reply
    • The one solution I came up with, although maybe there’s another way, was to split the initial table in two (luckily enough there are only two categories involved), perform the merge for each on them, get the info I needed, then append the two queries back together, with the new info already there.

      Reply
  4. Rick, thank you for this post. I could not find your follow up post where you were going to give some guidance on how to do this with a multi level hierarchy.
    In my case I have a running list with monthly spend by customer by month and I need my new column to only reflect the change if the customer ID is the same.
    Sample “Wrong “Data in row 3

    Customer ID Current Month Previous Month Difference
    12345 $500 $500 $0
    12345 $500 $500 $0
    22222 $0 $500 -$500

    Desired Output (row 3 is correct)
    Customer ID Current Month Previous Month Difference
    12345 $500 $500 $0
    12345 $500 $500 $0
    22222 $0 $0 $0

    I’m trying to make sure that the “difference” does not take into account the previous month if it’s the 1st occurrence of the new customer ID.

    Thanks!

    Reply
    • Hey Chris,

      You’re right, I didn’t make the follow up post yet. It’s relatively simple once you know it.

      In this post there is a step where I merge the index columns to retrieve the previous value. You can do the same merge, but make sure to have 2 columns to merge on. Both the Customer ID and the Index columns. With that addition, you should be able to respect the multi level hierarchy.

      Hope that helps!

      Reply
    • Hi Fish,

      If you like you can have a look at this code, that should give you the right result. First copy this code into the advanced editor, and name this query ‘Dataset’. This query makes sure the currencies are in the columns instead of rows.

      let
        Source = Table.FromRows(
          Json.Document(
            Binary.Decompress(
              Binary.FromText(
                "i45WMjDUBSIjA0NLJR0l19AgIGmgZ64U
                 q4MmFRrsApaygEgZY9NliCkH1waUiwUA", 
                BinaryEncoding.Base64
              ), 
              Compression.Deflate
            )
          ), 
          let
            _t = ((type nullable text) meta [Serialized.Text = true])
          in
            type table [Date = _t, CUR = _t, Rate = _t]
        ), 
        #"Changed Type" = Table.TransformColumnTypes(
          Source, 
          {{"Date", type date}, {"CUR", type text}, {"Rate", Int64.Type}}
        ), 
        #"Pivoted Column" = Table.Pivot(
          #"Changed Type", 
          List.Distinct(#"Changed Type"[CUR]), 
          "CUR", 
          "Rate", 
          List.Sum
        )
      in
        #"Pivoted Column"
      

      After that, paste the below code in a separate query. It makes use of the ‘Dataset’ query, so the naming is important. This query creates a list of dates, starting at the minimum date of the Dataset query, and ending at the maximum date of the Dataset query.

      It then fills down the missing values so the dates that miss get the previous value of the currency. Lastly, it unpivots the values to return to the original form. Hope that helps!

      let
        Source = List.Dates(
          List.Min(Dataset[Date]), 
          Duration.Days(List.Max(Dataset[Date]) - List.Min(Dataset[Date])) + 1, 
          #duration(1, 0, 0, 0)
        ), 
        #"Converted to Table" = Table.FromList(
          Source, 
          Splitter.SplitByNothing(), 
          {"Dates"}, 
          null, 
          ExtraValues.Error
        ), 
        #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table", {{"Dates", type date}}), 
        #"Merged Queries" = Table.NestedJoin(
          #"Changed Type", 
          {"Dates"}, 
          Dataset, 
          {"Date"}, 
          "Base", 
          JoinKind.LeftOuter
        ), 
        #"Expanded Base" = Table.ExpandTableColumn(
          #"Merged Queries", 
          "Base", 
          {"Date", "EUR", "USD"}, 
          {"Date", "EUR", "USD"}
        ), 
        #"Sorted Rows" = Table.Sort(#"Expanded Base", {{"Dates", Order.Ascending}}), 
        #"Filled Down" = Table.FillDown(#"Sorted Rows", {"EUR", "USD"}), 
        #"Removed Columns" = Table.RemoveColumns(#"Filled Down", {"Date"}), 
        #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(
          #"Removed Columns", 
          {"Dates"}, 
          "Attribute", 
          "Value"
        )
      in
        #"Unpivoted Other Columns"
      
      Reply
  5. Amazing! I’ve reached this for multiple task and exactly trying not to merge current table for performance issue.

    I’ve been trying to input the referencing of a previous row using the index column and reference method but stuck at the syntax for referencing each table.

    I did until (r) => “I think I need something here”{Index-1}[Rates] but unable to reference that nested table.

    Reply
    • Hi Ken,

      You can reference the column and position, but make sure to add the name of the previous step too.

      = #"Added Index"[Date]{ [Index 1] - 1 }
      

      Before example had “Added Index” as previous step name. Power Query needs that 🙂

      Rick

      Reply

Leave a comment