Some datasets contain information that is only relevant between two dates. This could be anything from contracts with a start and end date, to marketing campaigns that run during certain periods. Often, you’ll need to identify which contracts or campaigns are active at a time for a given transaction. Yet, while Power Query has a range of join types available, it lacks a built-in feature for joining data based on a date range. This article shows how you merge queries based on a date range by building your own custom join logic.
Table of contents
Introduction
Consider a scenario where hotel chains run promotional campaigns offering rooms at discounted rates. These campaigns have start and end dates defining their active periods. Analyzing the performance of the business requires matching transactions with active campaigns. A typical Campaign table can then look as follows:
This table contains the Campaign name, the Start Date, and End Date. For your analysis, it’s useful to see how the campaigns perform, so you want to relate the transactions to the relevant campaigns.
Suppose you’re working with the following Transaction table:
These 10 rows concern transactions, where you can find the ID, the Date, Description of the purchased room and the Amount the customer paid for the room.
You want to expand this dataset by including a column that identifies the campaign active at the time of each transaction. It’s important to note that, for the purpose of this task, we’ll assume that only one campaign can be active on any given date.
So how can you join on a date range? There are two obvious strategies to achieve this:
- Expanding the campaign table to create a distinct row for every day a campaign is active, effectively duplicating the campaign information across multiple dates.
- Creating a custom join that effectively links the campaign data with the transaction records based on the active date range.
Let’s see how both of these approaches work.
Method 1: Expanding the Campaign Table
The first method focuses on changing the Campaign table. The original table contains a unique row per campaign with a Start Date and an End Date. We will adjust this table so it shows a unique row for each date a campaign is active.
Creating a list of campaign dates
An effective way to do this is by creating a new custom column and using the List.Dates function. This column will hold a list of all active campaign dates. You can use the following code:
Table.AddColumn (
#"Changed Type",
"Date",
each List.Dates (
[Start Date],
Duration.Days ( [End Date] - [Start Date] ) + 1,
#duration ( 1, 0, 0, 0 )
),
type {date}
)
In the above code we create a new column called Date. The List.Dates function:
- takes a Start Date to generate a date range.
- calculates the number of days required for the list of dates by using Duration.Days.
- increments each step with a single day by using the #duration function.
- ascribes a data type of a list of dates.
The result of this operation is the below table:
As the image shows, you now have a new column that contains a list of dates for when each campaign is active.
Expanding the list of dates
Once you got this table, you can expand the list column by clicking the two opposing arrows in the column header, and selecting Expand to New Rows.
This operation results in a table that has a unique date for each day a campaign is active. The below image shows a sample of different rows of the resulting table:
As you can see, the Date column contains all dates between the start and end date of each campaign.
Joining the date columns
With this table in place, you can now navigate to the Transactions table. Next, perform a left-outer join operation between the Date column in the Transactions table and the Date column in the Campaign table.
To try this out yourself, you can make use of the following M code. You can create a query called Campaigns1 with the following definition:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VY7LCoMwFER/5ZK1gomt+0ofyxYtdCEuLvFagjFKEi3+fcVCm8JsDsxhpqpYSegGgxou5PGFC5QjSYXasYiJRKRxwtcEwPesjir2IOrINFCQt4Qerm1LNpR4FoDYbdIZe6UXODQzGT9ZghvKDp/k/qs/SPnmFUOPxisJJydxJDgS6m9LhP9W+PzLJ6cMOQd3izNpyAczuWBIxDwJIWN1/QY=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Campaign Name" = _t, #"Start Date" = _t, #"End Date" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start Date", type date}, {"End Date", type date}}),
AddListOfDates = Table.AddColumn(#"Changed Type", "Date", each
List.Dates( [Start Date], Duration.Days( [End Date] -[Start Date] ) + 1, #duration(1,0,0,0) ),
type {date} ),
#"Expanded Custom" = Table.ExpandListColumn(AddListOfDates, "Date"),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom",{"Campaign Name", "Date"})
in
#"Removed Other Columns"
Remember to call the above query Campaigns1. Next, create a new query, call it Transactions1 and add the following code to the advanced editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bdC/CoMwEAbwV5HMGnKX5DwfoFO79M8giFMJJUN10baP3xhpiFLI8sGPy3fXdQJEKVChrhRUikI4z84NxWUcnyEAsWys6MtO4B7e3j45zSQ1RKdzZ0M4fNx9nvzLFdfZT27RhJLWqSbTuHRp21Nx9MMjTbYkebU2WQwvhI0zqSllDmDftAbJJro6b6r3q2sAWVOEnEP8uxIa+ztAk+n4/XasaqTiCEHlC/FeIgcZCvRf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Date = _t, Description = _t, Amount = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Amount", type number},{"Date", type date}}, "en-US" ),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Date"}, Campaigns1, {"Date"}, "Campaigns1", JoinKind.LeftOuter),
#"Expanded CampaignsExploded" = Table.ExpandTableColumn(#"Merged Queries", "Campaigns1", {"Campaign Name"}, {"Campaign Name"})
in
#"Expanded CampaignsExploded"
Now let’s see another method for creating a date range join.
Method 2: Creating a Date Range Join
While Power Query supports a variety of join operations, it lacks a built-in option for joining data based on date ranges. In this section we look at manually constructing such a join.
Again we start with the campaign and transaction table as shown in the introduction. We want to merge the campaign description with our transaction records, based on the active dates of each campaign. Let’s see how we can do this.
Adding Campaigns table to new column
We start by adding a new column to our Transactions table, which will serve as a bridge to our campaigns data. We begin with the following code snippet:
Table.AddColumn(#"Changed Type", "CampaignJoin", each Campaigns )
This line of code uses the Table.AddColumn function to add a copy of the entire Campaigns table into every cell of our newly created column. This is a great starting position.
Filtering the campaigns table
The next step involves filtering down the relevant campaigns for each transaction based on the date range. Our Transactions table includes a ‘Date‘ column, which we’ll use for this purpose.
We want to filter the content of our newly added Campaigns column so that only those campaigns whose Start Date is on or before the transaction Date and whose End Date is on or after the transaction Date are considered relevant.
A first attempt could be:
Table.AddColumn(#"Changed Type", "CampaignJoin", each
Table.SelectRows(
Campaigns,
each [Start Date] <= [Date] and [End Date] >= [Date]
)
)
Unfortunately, this code won’t work. You will receive the error: “The field ‘Date’ of the record wasn’t found”. The reason for this is that the Table.SelectRows function tries to look for the Date column within the Campaigns table. And since this table does not have a column named Date, the error appears.
The challenge in this scenario is that we want to refer to values in different scopes. There’s the inner scope of the Campaigns table within Table.SelectRows, and the outer scope of the Transactions table within the Table.AddColumn function.
Tip
The each construct is used in code as syntax sugar for a function with a single parameter. Instead of each you could also write (_) =>
. In this code we find a single parameter (the underscore) that is passed to the function body.
Let’s understand what the issue is by replacing each for a function definition. In the below example, I have replaced the ‘each’ construct with an equivalent that manually defines a function.
At the same time I added the underscore before each of the field references, to illustrate that this reference picks a field from the table represented by the underscore.
Table.AddColumn(#"Changed Type", "CampaignJoin", (_) =>
Table.SelectRows(
Campaigns,
(_) => _[Start Date] <= _[Date] and _[End Date] >= _[Date]
)
)
So what is the issue here? References to the inner scope (Campaigns) and the outer scope (Transactions) use the same parameter, the underscore.
Use distinct parameters for inner and outer scope
So what can we do to resolve this? The key lies in defining a unique parameter for each scope. If we swap those around, we can successfully filter our tables. Here’s the adjusted code:
Table.AddColumn(#"Changed Type", "CampaignJoin", (T) =>
Table.SelectRows(
Campaigns,
(C) => C[Start Date] <= T[Date] and C[End Date] >= T[Date]
)
)
This code works by referencing the Transactions table (T) and the campaigns table (C) using separate parameters. After implementing this change, we can apply our filters.
Each row then contains a table object filtered down to the relevant rows. The only thing left for us to do is to select the campaign name.
Error handling and buffering
Another important consideration is that there may also be rows without a match in the campaigns table, for instance on February 8th 2024. We should therefore select the first value in the Campaign Name column and add some error handling by making the selection optional using the ‘?‘ operator.
Lastly, since we are filtering the campaigns table repeatedly it’s important to buffer it in memory. That allows the operation to be more memory efficient. You can do that by passing the table to the Table.Buffer function.
By adding these changes and some variables with a let-statement, our final code then looks as follows:
Table.AddColumn(#"Changed Type", "CampaignJoin", (T) =>
let
CampaignsTable = Table.Buffer( Campaigns ),
filterTable =
Table.SelectRows(
CampaignsTable,
(C) => C[Start Date] <= T[Date] and C[End Date] >= T[Date]
) ,
selectValue = filterTable[Campaign Name]{0}?
in
selectValue,
type text
)
The result of this join is the following table:
The result is a range based join that returns null when it can’t find a match. It’s important to note that our original campaigns table should use valid dates. In case of overlapping values, you may not get the expected results.
If you want to try this for yourself, you can make use of the below code. For the campaigns table, use this code and name the query Campaigns2:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VY7LCoMwFER/5ZK1gomt+0ofyxYtdCEuLvFagjFKEi3+fcVCm8JsDsxhpqpYSegGgxou5PGFC5QjSYXasYiJRKRxwtcEwPesjir2IOrINFCQt4Qerm1LNpR4FoDYbdIZe6UXODQzGT9ZghvKDp/k/qs/SPnmFUOPxisJJydxJDgS6m9LhP9W+PzLJ6cMOQd3izNpyAczuWBIxDwJIWN1/QY=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Campaign Name" = _t, #"Start Date" = _t, #"End Date" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start Date", type date}, {"End Date", type date}})
in
#"Changed Type"
After adding the campaigns table, you can paste below code for the Transactions2 query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bdC/CoMwEAbwV5HMGnKX5DwfoFO79M8giFMJJUN10baP3xhpiFLI8sGPy3fXdQJEKVChrhRUikI4z84NxWUcnyEAsWys6MtO4B7e3j45zSQ1RKdzZ0M4fNx9nvzLFdfZT27RhJLWqSbTuHRp21Nx9MMjTbYkebU2WQwvhI0zqSllDmDftAbJJro6b6r3q2sAWVOEnEP8uxIa+ztAk+n4/XasaqTiCEHlC/FeIgcZCvRf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Date = _t, Description = _t, Amount = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Amount", type number},{"Date", type date}}, "en-US" ),
#"Range Join" = Table.AddColumn(#"Changed Type", "CampaignJoin", (x) =>
let
filterRows = Table.SelectRows( Campaigns2, each [Start Date] <= x[Date] and [End Date] >= x[Date] ) ,
selectValue = filterRows[Campaign Name]{0}?
in selectValue,
type text )
in
#"Range Join"
Conclusion
And that’s it, we achieved our goal. You learned about scope, filtering down a table manually to replicate a join operation and buffering operations to improve performance. By writing a bit of custom M code, we created a join on a start and end date range.
Enjoy Power Query!
Recommended Reading…
Power Query Foundations
Beginners Guide to Power Query
List Functions (200+ Examples)
Text Functions (150+ Examples)
Creating Tables (40+ Examples)
Generate the Ultimate Date Table
Advanced Topics
Master List.Generate with Easy Examples
Deep Dive into List.Accumulate
applied the method 2 and its working correctly in power query, but once i loaded it to power bi it takes time to load. Did i do something wrong or my data is just to big?
Hello Rick, thanks again for this great topic. The one you quicly stumble upon when you deal with retail business data like price lists that change over time, for instance.
With your campaign example, we could fine tune it to deal with overlapping periods of time. The selectValue variable could be
instead of
(thanks for the ‘?’ tip). It works fine with overlapping campaigns, which is the case on the 24th of January. And it still works with no campaign dates.
Hope it will help.
Have a great day!
Marc
Hey Marc –
That’s a great trick to use, very effective for returning multiple text values. I’ve used it before when grouping values too.
In the end, the solution is only as good as the requirements. In the cases I worked with, there never was supposed to be an overlapping period. But if there is, I think you have a great way to work with this logic. Thanks for sharing!
This is exactly the problem I’ve been working on last week – thank you. I ended up with an SQL solution, but this is more elegant and I’d like to test if it’s quicker.
Your book arrived today – I’m excited to have a read.
Great to hear that. Hope you enjoy the book!
Thank you very much for sharing. Just this week I was able to solve a real problem thanks to this article. In my case, I not only had to combine with dates, but I also had a range of weights. In other words, I had more than one condition that had to be fulfilled.
This article has been my salvation… I repeat thank you very much.
With multiple conditions you can keep building on this logic. Sounds like you got it to work correctly, glad to hear that!
Rick