Power BI Dataflows In Excel with ADLS2

Power BI can become a closed garden for Excel users, but what if we use Dataflows to expose the data and read it in Excel? Nice in theory, but there's no Excel connector. What if we use Azure ADLS2 as the target of our dataflow? Again we run into the connector problem. Then I found a solution that went via Synapse Serverless SQL (Use Synapse On-Demand SQL to read Power BI Dataflow CDM data, by Vesa Tikkanen) and implemented it. Then I saw the light. Vesa's solution reads the model.json file in ADLS2 and then uses it to read the CSV files, all of this in SQL, but why not do the same in Power Query? I happened to have some test data from the Planner project in ADLS2.

The model.json file contains the information that we need to connect to, and read, the data. For a single entity it looks like this:

        {
            "$type": "LocalEntity",
            "name": "GroupName",
            "description": "",
            "pbi:refreshPolicy": {
                "$type": "FullRefreshPolicy",
                "location": "GroupName.csv"
            },
            "attributes": [
                {
                    "name": "groupName",
                    "dataType": "string"
                },
                {
                    "name": "groupId",
                    "dataType": "string"
                }
            ],
            "partitions": [
                {
                    "name": "Part001",
                    "refreshTime": "2021-09-08T05:03:04.2068762+00:00",
                    "location": "https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/GroupName.csv.snapshots/GroupName.csv@snapshot=2021-09-08T05:03:04.1131321Z"
                }
            ]
        }
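
Before going further, a quick sanity check (a minimal sketch, assuming the same model.json URL as in the queries below) is to list the entity names the dataflow exposes:

let
    Source = Json.Document(Web.Contents("https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/model.json")),
    // each entry in 'entities' is a record holding name, attributes and partitions
    entityNames = List.Transform(Source[entities], each [name])
in
    entityNames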

Excel is not ideal for development, so I started with Power Query in Power BI Desktop.
Here is the initial query; note the 'dfs' in the URL:

let
    Source = Json.Document(Web.Contents("https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/model.json")),
    entities = Source[entities],
    #"Converted to Table" = Table.FromList(entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"name", "partitions"}, {"name", "partitions"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([name] = "Bucket")),
    #"Expanded partitions" = Table.ExpandListColumn(#"Filtered Rows", "partitions"),
    #"Expanded partitions1" = Table.ExpandRecordColumn(#"Expanded partitions", "partitions", {"location"}, {"partitions.location"})
in
    #"Expanded partitions1"

That delivers the location of the latest Bucket file, i.e. the snapshot URL from the partitions section of model.json.

We also need the column names, as the csv file doesn't have them:

let
    Source = Json.Document(Web.Contents("https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/model.json")),
    entities = Source[entities],
    #"Converted to Table" = Table.FromList(entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"name", "attributes"}, {"name", "attributes"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([name] = "Bucket")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"name"}),
    #"Expanded attributes" = Table.ExpandListColumn(#"Removed Columns", "attributes"),
    #"Expanded attributes1" = Table.ExpandRecordColumn(#"Expanded attributes", "attributes", {"name"}, {"name"}),
    AttributeList = Table.ToList(#"Expanded attributes1")
in
    AttributeList

That gives us the list of column names for the Bucket entity.
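
Table.ToList only works that neatly because a single column is left at that point; an equivalent, more explicit final step (a small variation on the query above, not something you strictly need) would be:

    // take the 'name' column as a list, regardless of how many columns remain
    AttributeList = Table.Column(#"Expanded attributes1", "name")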

Then we can get some Bucket data:

let
    Source = Json.Document(Web.Contents("https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/model.json")),
    entities = Source[entities],
    #"Converted to Table" = Table.FromList(entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"name", "partitions"}, {"name", "partitions"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([name] = "Bucket")),
    #"Expanded partitions" = Table.ExpandListColumn(#"Filtered Rows", "partitions"),
    #"Expanded partitions1" = Table.ExpandRecordColumn(#"Expanded partitions", "partitions", {"location"}, {"location"}),
    buckets = Csv.Document(Web.Contents(#"Expanded partitions1"[location]{0}),[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None])
 
in
    buckets

Which gives us the raw Bucket rows, still without column names.


Then we rename our columns, using the handy List.Positions and List.Transform functions (here buckets and columnnames refer to the two queries above):

let
    listpositions = List.Positions(columnnames),
    tablecolumns = Table.ColumnNames(buckets),
    rename = List.Transform(listpositions, each {tablecolumns{_},columnnames{_}}),
    bucketsrenamed = Table.RenameColumns(buckets,rename)
in
    bucketsrenamed
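
An equivalent rename can be written with List.Zip, which pairs the two lists directly (a sketch under the same assumptions, reusing the buckets and columnnames queries):

let
    // pair each default column name (Column1, Column2, ...) with its attribute name
    rename = List.Zip({Table.ColumnNames(buckets), columnnames}),
    bucketsrenamed = Table.RenameColumns(buckets, rename)
in
    bucketsrenamed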

This is all looking pretty good, so we create a query, making the table name a parameter pTableName, and use Power BI to create a Power Query function for us:

let
    Source = (pTableName as any) => let
        Source = Json.Document(Web.Contents("https://planner9.dfs.core.windows.net/powerbi/PlannerDataFlow/PlannerFlow/model.json")),
        entities = Source[entities],
        #"Converted to Table" = Table.FromList(entities, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"name", "partitions","attributes"}, {"name", "partitions", "attributes"}),
        #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([name] = pTableName)),
        #"Expanded partitions" = Table.ExpandListColumn(#"Filtered Rows", "partitions"),
        #"Expanded partitions1" = Table.ExpandRecordColumn(#"Expanded partitions", "partitions", {"location"}, {"location"}),
        buckets = Csv.Document(Web.Contents(#"Expanded partitions1"[location]{0}),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
        #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"name","partitions"}),
        #"Expanded attributes" = Table.ExpandListColumn(#"Removed Columns", "attributes"),
        #"Expanded attributes1" = Table.ExpandRecordColumn(#"Expanded attributes", "attributes", {"name"}, {"name"}),
        columnnames = Table.ToList(#"Expanded attributes1"),
        listpositions = List.Positions(columnnames),
        tablecolumns = Table.ColumnNames(buckets),
        rename = List.Transform(listpositions, each {tablecolumns{_},columnnames{_}}),
        bucketsrenamed = Table.RenameColumns(buckets,rename)
    in
        bucketsrenamed
in
    Source
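
One assumption baked into the function is that each entity has a single partition, since it only reads location{0}. My Planner test data behaved that way; if your dataflow writes several partitions per entity, a hypothetical variant of the buckets step (untested sketch) would read and stack them all:

        // replaces the 'buckets' step inside fGetTable: read every partition and combine the results
        allParts = List.Transform(#"Expanded partitions1"[location],
            each Csv.Document(Web.Contents(_), [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None])),
        buckets = Table.Combine(allParts),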

This function is then ready to go into Excel, so we paste it in as the function 'fGetTable', and for each table we create a query:

let
    Source = fGetTable("Bucket")
in
    Source

We need to set some of the column types:

let
    Source = fGetTable("Calendar"),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, 
                                                        {"IsWorkingDay", type logical}, 
                                                        {"MonthNumb", Int64.Type}})
in
    #"Changed Type"
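
The same pattern works for the other entities; for example, the GroupName entity from the model.json sample above, whose two attributes are both strings (a sketch, assuming the same fGetTable function):

let
    Source = fGetTable("GroupName"),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"groupName", type text},
                                                         {"groupId", type text}})
in
    #"Changed Type"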

Then we have the data in Excel.



The key points to this story are:

  • Create your ADLS2 in the Azure Portal
  • Map an empty Power BI workspace to Azure ADLS2
  • Add your Dataflow and give it a data refresh
  • Use Azure Storage Explorer to find the URL with 'dfs'
  • Prototype your Power Query function in Power BI (optional)
  • Implement your query in Excel
  • This has been tested on desktop Excel
  • This has not been tested with Premium DataFlows

Permissions need to be correct for this to work: you authenticate the connection using your organisational login, and you will need to turn off the Power Query firewall (the privacy-level checks).

I'm sure I've missed something, not explained it well or just plain got it wrong, so DM me on Twitter with questions. 

What happens next and how we use the data will be another post that will focus more on Planner in Excel.
