Archive for March, 2021
Wednesday past (March 24th 2021), I had the wonderful pleasure of appearing as a co-host on the bi-weekly PowerBI Quiz by Just Blindbaek, a Danish fellow Data Platform MVP. We’re nearing the end of Season 3, and though I haven’t been able to chime in every time, or score the way I wanted score, I still have a blast every time I play this. There’s a large number of returning faces every time, and we keep each other on our toes, and have tons of fun whilst doing so.
When Just asked me a few weeks back if I wanted to host a topic, and I immediately jumped to this opportunity. I’ve been doing some work around APIs and PowerShell, which was a topic I dreaded before. Hence, this was an excellent topic to test the knowledge of the Power BI Quiz participants. All questions are aimed at a governance perspective, and how the APIs/PowerShell can be a useful tool for helping things stay afloat.
Whilst making these questions, I also realised it would be an excellent blog post to explain the questions and answers in a blog post afterwards, to have a better explanation of my reasoning. As well, I learned it’s harder than expected to create questions that fall in the not too hard, not too easy sweet spot for a quiz. This makes me respect the other quiz makers even more!
The recording of the Power BI Quiz can be found below, or you navigate through the Power BI Quiz website
The Questions and Answers
1. The Power BI Activity Log can return data for the last ..
- A – 90 days
- B – 30 days
- C – 45 days
- D – 60 days
Explanation: The Power BI Activity Log will only return day for the last 30 days, which is different than the Office 365 Audit Logs through the Security & Compliance Centre, which will return the last 90 days. You’ll have to “sacrifice” those extra 60 days of possible history, but in return you get a more stable API (in my opinion), and less required privileges to extract that data.
Meaning, you only need to have permissions on the Power BI side (Power BI Admin Role specifically), and no roles in the Security & Compliance Center of Office 365 (At least viewer role). I have had some organisations where this was a definite no-go, as this would mean they could extract the logs for all Office 365 components.
2. The Export to File API will work with workspaces using
- A – An A1 Sku (or higher)
- B – Power BI Pro
- C – Power BI Free
- D – A P1 Sku (or higher)
Answer: A, D
Explanation: The Export to File API will export your Power BI Reports or Paginated Reports to file format of your choice. Based on the capacity you have backing up these requests, you’ll get a higher concurrency rate for processing these reports. In the background, it’s essentially doing the print/save as options you can manually, but more optimised for bulk usage.
Some things to note .. Right now, you can technically get it to work with a PPU license, but throttling will hit you fast ‘n hard. If this behaviour will persist after General Availability on April 2nd, I can’t tell. An A1 Sku (or higher) is intended for external (outside of your own organisation) embedding scenarios, and can help you get started with this scenario really quickly.
3. Power BI dataflows can be created/copied through API calls
- A – TRUE
- B – FALSE
Explanation: Yes, it’s possible! But it’s not easy 😃. This process is based on the Import APIs, which lets you upload .pbix files for a report, or the .json files for dataflows. While this sounds easy enough, there’s a few hoops you have to jump through, which I never got working on my own. Luckily, a few community members have posted working options and scripts to do this, and now there’s even an External Tool for it.
4. To determine if a .pbix file uses DirectQuery through APIs, I can
- A – use Report Information
- B – use Dataset Information
- C – use Datasource Information
- D – use DirectQuery RefreshSchedule
Answer: B, C, D
When using the Dataset Information (GetGroupsAsAdmin, with a $expand on datasets), it will return a field called ‘ContentProviderType’, which displays the connection mode for the .pbix file. For those files that run DirectQuery, you’ll see a value ‘PbixInDirectQueryMode’.
When using the Datasource Information (GetDataSourcesAsAdmin), there’s 2 fields called Name and ConnectionString. Based on the tests I did on my environments, they only returned values when DirectQuery is being used. Would I trust this completely to base myself on for all scenarios? Definitely not, but it’s something!
When using the DirectQuery RefreshSchedule, you can call this API for every dataset. The ones that actually return a schedule are either DirectQuery or LiveConnection. Based on the DataSourceType (pretty much exclude Power BI / Analysis Services), the returned records are datasets which have the default behaviour for caching enabled. Since you have to loop over all your datasets individually, you have a high risk of running into limits on this one, and an additional chance that someone disabled the caching behaviour.
To conclude this, the most durable solution I found there was to use the Dataset Information, but this has not yet been tested with Composite Models v1, or Direct Query over Azure Analysis Services and Power BI Datasets.
5. “GetScanResult” is the only API that returns Dataset endorsements
- A – TRUE
- B – FALSE
Explanation: At the time of writing (and airing), this is definitely the case! GetScanResult is part of the asynchronous mechanism to incrementally fetch all the lineage information on your Power BI Tenant. Essentially it’s the same that’s happening under the covers of Azure Purview, and this allows you to build your own solution.
To successfully do this, you need to handle the calls for modified workspaces, poll for the request state, and then get the results back. If you’re fairly new to APIs & PowerShell, this is definitely daunting. Luckily, there’s community resources out the there to help us get started.
6. Using a Service Principal (SPN) I can use these read-only Admin APIs
- A – GetDatasetsAsAdmin (Datasets)
- B – AddUserAsAdmin (Groups)
- C – GetDatasourcesAsAdmin (Datasets)
- D – GetCapacitiesAsAdmin (Capacities)
Answer: C, D
Explanation: Read-only Admin APIs and Service Principal authentication were only announced in December 2020, but they made a really big difference for the type of things we’re doing. They allow us to use an Azure AD App Registration, and extract the metadata details on our Power BI Tenant we need, fully unattended! Right now, there’s only an initial batch of APIs that are allowed for SPN authentication, but I expect more to arrive over time.
The only thing I find a bit wonky right now is that we can call GetDataSourcesAsAdmin, which requires a DatasetID. Yet we can not call GetDatasetsAsAdmin to help us get started. Meaning, if we want to iterate over datasets unattended, we have to call a different API (such as GetGroupsAsAdmin, $expand datasets, or GetDatasets) to help us get started. Using the GetCapacitiesAsAdmin, we can return all the info we need on the capacities we have in our organisation.
AddUsersAsAdmin is not allowed, which makes sense as it’s technically not a read-only Admin API.
7. Using “GetGroupsAsAdmin” I can use the $Expand parameter for
- A – datasets
- B – datasources
- C – users
- D – apps
Answer: A, C
Explanation: GetGroupsAsAdmin is my bread and butter for getting most details out of my Power BI Tenant, especially in combination with $Expand. Before this existed, we had to individually loop over all the object (nested calls most of the time) to return the required results. For larger tenants, this meant bumping into API Limits all .. the .. time .. The GetGroupsAsAdmin API is called once for every workspace, and will return all the specified information in a single go. There’s a limit for 5000 workspaces per call, but you can work your way around this by intelligently looping over this.
The $expand parameter can be used for users, reports, dashboards, datasets, dataflows, workbooks. Meaning datasources and apps are not a part of this.
8. I can use the (Datasets) UpdateDatasources API to change a parametrized Connection String
- A – TRUE
- B – FALSE
Explanation: We can use API calls to alter connection strings, for instance to change between DEV/PRD environments, or perform a migration. One of those options is using the UpdateDatasources API, to do so. When using DirectQuery datasources, I’ve had some real issues when using a parametrized ConnectionString, as it would not allow it. When reading the Docs page, it actually does outline this a restriction, and tells us to use the UpdateParameters API to change the parameters, and not the connection string. In full transparency, I’d look at Rebind Report operations, or External Tools to assist in this process, as I had some kinks to work out when doing this.
9. To safely unassign a Workspace from a Capacity using APIs I can
- A – Delete the workspace, it can not be unassigned through the API
- B – Use “CapacityAssignmentStatus” with -UnAssign and Workspace GUID
- C – Use “UnassignFromCapacity” with the workspace GUID
- D – Use “AssignToCapacity” with an empty GUID (0000..00) for CapacityID
Explanation: Off the bat, let me emphasise on the word “safely” in the question. When deleting a workspace it’s no longer assigned to the workspace, but we also lost the content that was included in this..
The only real solution to this is using the AssignToCapacity API, with an empty GUID (00000000-0000-0000-0000-000000000000), and thus assigning to a default capacity (the shared one for Power BI Pro). CapacityAssignmentStatus is an actual API, but it’s only used for getting the status reports on certain workspaces, when making the switch.
UnassignFromCapacity does not exist as an official API, and is something I completely made up 😃
10. To inventarise who CAN use PPU (Premium Per User) features, I have to:
- A – Check the Power BI Activity Log
- B – Refer to the PPU Tab in the Power BI Admin Portal
- C – Grab licensing information (ie. through Graph API)
- D – Check Power BI Workspace information
Answer: A, C, D
Explanation: First off, I have to explain ‘inventarise’, as it caused some confusion during the quiz. My meaning for this is the assemble a list of users who have the licenses and workspace access to perform PPU activities. To get the license information, we have to look at 2 different sources. The easiest one is the licenses that were assigned by a License or Power BI Admin, and these be extracted through the Graph API for instance. The hardest one is the In-Product Trial Experience, where users opt-in for the paid trial and get a free 60 day access pass to take it for a spin. The only way we can extract that information is a part of the Power BI Activity Log (or Office 365 Audit Log). Specifically we need to look for the ‘OptInForProTrial’, and ‘OptInForExtendedProTrial’ actions.
Then, we need to look at workspaces that are assigned to the Premium Per User reserved capacity, which we can do with the GetGroupsAsAdmin API, preferred with the $expand on users, to return the access list for the workspace. Luckily, I made a write-up on this process a while back, that holds some information on what you can do to prevent it, and steps to figure out who’s doing it.
One remark, which was stressed by Sir Saxton of Cubes as well, is the essential purpose of storing the Activity Logs as soon as you can. Due to the limited retention (30 days) you can not go back in time to when these licenses were made available. This is just one of the use cases where the Activity Log comes to the rescue, and I run into different uses on a daily basis. If there’s anything you should learn from this, it’s to get that extraction and retention up and running, ASAP!
I had lots of preparing the questions, and co-hosting the quiz with Just. I’m hoping people have learned something new on this, and that they’ll dabble into some of these things themselves as well. But, why not join us for the real deal next time?
Next time we’re doing the Power BI Quiz is on Wednesday April 7th at 8PM UTC+1, with the co-host to be announced
Subscribe to Just’s YouTube channel, and you’ll get the notifications for it as well