John Dalesandro

Power Query: Handling Paginated REST API Data

Processing paginated API results where the page count is either known or unknown
Harry Potter New York - Weird Wizarding Dilemmas and their Solutions

Querying a RESTful API, processing its response, and transforming the data with Power Query is a powerful feature. When integrated with Power BI or Excel, Power Query can often simplify the process, eliminating the need for additional tools or scripts for handling JSON responses.

However, working with REST APIs can be challenging, especially with large data sets. To manage processing and bandwidth, APIs often paginate responses, breaking them into multiple pages. For example, if a query is expected to return a million entries, the API might divide the result into pages of 100 entries each. Retrieving the full set would then require ten thousand API requests (100 entries per page * 10,000 pages = 1,000,000 entries).

This post demonstrates Power Query code to process paginated JSON results from a REST API. It provides three solutions for handling scenarios where the number of pages is known, calculated, or unknown.

REST API

The code in this solution uses a publicly available REST API from the U.S. Department of the Treasury called the Treasury Reporting Rates of Exchange. This API provides the U.S. government’s official quarterly exchange rates for converting foreign currencies to U.S. dollars. I chose this API because exchange rates are a familiar concept, though not crucial for this exercise. More importantly, it provides a large amount of data needed to demonstrate pagination. I recommend checking out the full U.S. Treasury Fiscal Data site, which offers various well-documented, free-to-use data sources that don’t require sign-ups or API tokens. It’s a great resource for learning how to interact with APIs and develop analytics with real-world data.

The API query used here is simple. It requests three data fields from the Rates of Exchange API: Country – Currency Description, Exchange Rate, and Record Date. The JSON response includes metadata like count, total-count, and total-pages. These fields are key for verifying that the Power Query code works correctly. The count field shows the number of entries returned, the total-count field shows the total number of entries, and the total-pages field shows the number of pages of results, such as 100 results per page across 162 pages in total (with the 162nd page containing only 10 entries).

Custom Power Query Paging Function

According to the API documentation, there are two parameters for pagination. The page[size] parameter controls the number of entries per page, with the default set to 100 entries. This example uses the default setting. The more crucial parameter here is page[number], which specifies the page index to retrieve a specific page from the data set. For example, if there are 42 total pages and page[number] is set to 7, the response will return the 100 entries on page 7.

GetPage Function

We need to create a custom Power Query function to handle a dynamic number of pages returned by an API. This function has one parameter, page, which represents the page number to retrieve. This function is used in all three solutions.

  1. In Power BI, Excel, or any other application that supports Power Query, select Get Data and then Blank Query.
  2. The Power Query window opens with a new blank query.
  3. From the Power Query Home ribbon, select Advanced Editor.
  4. Paste the following code into the Advanced Editor, replacing any existing code.
  5. Click Done to close the Advanced Editor.
  6. In the Query Settings pane under Properties, rename the function to GetPage.
(page as text) =>
let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?page[number]=" & page & "&fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Removed Columns" = Table.RemoveColumns(#"Converted to Table",{"meta", "links"}),
    #"Expanded data List" = Table.ExpandListColumn(#"Removed Columns", "data"),
    #"Expanded data Records" = Table.ExpandRecordColumn(#"Expanded data List", "data", {"country_currency_desc", "exchange_rate", "record_date"}, {"data.country_currency_desc", "data.exchange_rate", "data.record_date"})
in
    #"Expanded data Records"

Retrieving All Pages

This API conveniently provides the total number of pages in its JSON response, making page retrieval straightforward. However, many other APIs do not offer this information. Some only return the total number of entries, while others don’t provide any counts. For these less accommodating APIs, I’ve included alternative methods to calculate or retrieve pages dynamically without meta counts.

To keep things simple, we will use the same API for each scenario and ignore the number of entries and pages in order to demonstrate.

  1. From the Power Query Home ribbon, select New Source and then Blank Query.
  2. The Power Query window opens with a new blank query.
  3. From the Power Query Home ribbon, select Advanced Editor.
  4. Paste one of the following code sequences into the Advanced Editor (appropriate to your specific scenario), replacing any existing code.
  5. Click Done to close the Advanced Editor.
  6. In the Query Settings pane under Properties, rename the function to Treasury Reporting Rates of Exchange.

Method 1: Page Count is Known

In the easiest scenario, the API response includes a total page count. No additional calculation is needed.

let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Page List" = if Source = null or Source[meta] = null then {null} else {1..Source[meta][#"total-pages"]},
    #"Converted to Table" = Table.FromList(#"Page List", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "PAGES"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"PAGES", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "GetPage", each if [PAGES] <> null then GetPage([PAGES]) else null),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"

Method 2: Page Count is Unknown; Entry Count is Known

If the number of pages isn’t provided by the API, it can be calculated using available data. If the API returns the total number of entries and the number of entries per page is known, you can calculate the number of pages by dividing the total number of entries by the number of entries per page and rounding up.

The main difference from method 1 is in the Page List step. This step generates a dynamic list using {1..Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])}. This calculates the number of pages by dividing the total-count (total entries) by count (entries per page) and rounding up to ensure all entries are included. If there are any entries left over after dividing, they will be included in an additional partial page.

For example, with 205 total entries and 100 entries per page, there would be 3 pages to include the extra 5 entries beyond 200. Adjust the calculation if the API pagination starts at zero or a different number.

let
    Source = try Json.Document(Web.Contents("https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date")) otherwise [data=null, meta=null, links=null],
    #"Page List" = if Source = null or Source[meta] = null then {null} else {1..Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])},
    #"Converted to Table" = Table.FromList(#"Page List", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "PAGES"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"PAGES", type text}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "GetPage", each if [PAGES] <> null then GetPage([PAGES]) else null),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"

Method 3: Both Page Count and Entry Count are Unknown

In this more complex scenario, neither the total number of pages nor the total number of entries is known because the API does not provide any metadata on result counts.

The List.Generate function is used to iterate through pages until a null page is returned, which indicates that the page contains no entries. This requires the custom GetPage function to return null when a page with no results is accessed. The condition for detecting a null page should be tailored to the specific API being used.

let
    Source = List.Generate(() => [PAGE = 1, RESULTS = GetPage("1")], each Table.RowCount(Table.SelectRows(_[RESULTS], each not List.Contains(Record.ToList(_), null))) > 0, each [PAGE = _[PAGE] + 1, RESULTS = GetPage(Number.ToText([PAGE] + 1))], each _[[PAGE], [RESULTS]]),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"PAGE", "RESULTS"}, {"PAGES", "GetPage"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Column",{{"PAGES", type text}}),
    #"Expanded GetPage" = Table.ExpandTableColumn(#"Changed Type", "GetPage", {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"})
in
    #"Expanded GetPage"

Summary

The post explains how to use Power Query to handle paginated REST API responses, focusing on three scenarios: when the page count is known, when only the entry count is known, and when both counts are unknown. A custom function is created to retrieve individual pages for dynamic pagination and solutions include handling known page counts directly, calculating pages from entry counts, and iterating through pages until no additional data is returned.