Posts Tagged Power BI
It’s T-SQL Tuesday!!
“For this month’s T-SQL Tuesday, I want you to write about how you have used, or would like to use, a Jupyter notebook. This seemed to be exciting for many people at first, but I haven’t seen a lot of uptake from users in general. So I’m curious if you are using them.”
The original post is here.
Jupyter Notebooks and me
In my write-up for T-SQL Tuesday #135, I wrote the following:
“After seeing some folks like Julie Koesmarno (twitter), Aaron Nelson (twitter) and Rob Sewell (twitter) show off Notebooks in Azure Data Studio, I’m keen in rewriting some of my scripts and processes into a fancy Jupyter Notebook. I’m convinced this will really help me in some of the assessments I do, and easily share my work and results. Other than that, it’s fancy toys I want to try out for myself.”
By now, I actually did start on reforming some of my scripts into Jupyter Notebooks, with the emphasis on using them in an explanatory context. Maybe even use it an x step plan to execute certain tasks that can not be automated against the APIs. Maybe even turn them into guidebooks for clients to start working on some processes of their own.
I have to admit, I’m liking them a lot so far. My plan is to make this into a series of blog posts, and a community session to present. But, I’m having issues finding time and motivation to stride forward on this.
What kept me from actually getting started with this was the initial step and learning curve I had to take to rewrite my scripts (on top of my natural talent for procrastination). But then, we had Aaron Nelson (twitter) over for a user group session at dataMinds.be, on how he uses PowerShell (and Jupyter Notebooks) to aid in some of the processes. I voiced my intent on getting started, and Aaron graciously replied to me how easy it actually is to rewrite an existing script into a Jupyter Notebook running PowerShell. The only thing I ran into, is that I had some issues with the way I use comments, and how they’re transformed into Markdown. But, nothing I wasn’t able to solve easily with the help of my trusty friend Mr. Bing.
I stand corrected. Apparently the illustrious @jamesoneill has simplified the situation 😃
Doug pointed out that it's now as easy this:
dir *.ps1 | ConvertTo-PowerShellNoteBook
— Aaron Nelson #PowerBI 👨💻 👨🔬 📊 (@SQLvariant) January 22, 2021
Extracting the Power BI Activity Log with a Jupyter Notebook
Right now, the only Jupyter Notebook I have ready destined for public eyes, is one to get started with extracting the Power BI Activity Log, and storing them into a .csv file on a location of your choice. I’m convinced every Power BI Admin should set up extraction of the Activity Log as soon as possible, to be able to backtrack to historical usage and analysis. This log only returns data for the past 30 days, so it’s not all that long.
Why did I opt for PowerShell and .csv files, when there’s other solutions out there like using Azure Data Factory as Just Blindbaek (blog post) did? I have a mix of clients scheduling tasks on-premises, or through Azure Automation with Azure Functions. The solution I came up with, had to be easily changeable in between the different environments without having to rewrite the entire thing.
This scenario is written in 2 flavours, being a regular and simple version. The simple version doesn’t have any of the frills added to it, and is purely destined for explaining the happy path that can occur with this. The version I use myself is the regular one, and this comes with some logging and simple error handling, to counter some of the scenarios I ran into when testing.
You can find the notebooks on my GitHub repo, and can easily get started with this. All you need is a user that has Power BI Service Admin privileges, and you’re good to go!
What notebooks do I have coming up?
The notebook I’m currently rewriting is to bulid up a Power BI Metadata inventory of your most commonly used artefacts, through the Read-only Admin APIs. In addition, I’m working on assembling a lineage view of your artefact through the scanning APIs that were released when the Azure Purview solution came into preview.
I have some more things brewing for on-premises Enterprise Data Gateways, and using the Graph API for licenses, group memberships, .. Finding the time for those is something I’m really struggling with right now.
Is there anything else brewing for Power BI and Jupyter Notebooks?
The Power Platform 2020 Release Wave 2 stated Embedding of Power BI Reports in Jupyter Notebooks was coming to GA in March 2021. I think it got moved back, but I’m hoping to see it come to life soon. Purely for Data Analyst purposes, there’s some cool things that can be done with this 😃.
“Scientists, journalists, researchers, developers, and other Jupyter notebook users will be able to embed Power BI reports, and then use Power BI features to improve their data storytelling. Once embedded, adding Power BI bookmarks will make it possible to show different views of the same report and order those views to facilitate data storytelling.”
This T-SQL Tuesday was exactly the kick I needed to start finalising some of the things that had been brewing for a longer time, and came at exactly the right time! I’m absolutely convinced Jupyter Notebooks have a place for explaining some of the more practical things in Power BI Governance, because of the capabilities of adding virtually any object in Markdown. The initial learning curve I feared, wasn’t that hard at all, and it actually made me realise these things should be used more often.
That said, I hope you enjoyed my ramblings.
Stay safe, take care!
Wednesday past (March 24th 2021), I had the wonderful pleasure of appearing as a co-host on the bi-weekly PowerBI Quiz by Just Blindbaek, a Danish fellow Data Platform MVP. We’re nearing the end of Season 3, and though I haven’t been able to chime in every time, or score the way I wanted score, I still have a blast every time I play this. There’s a large number of returning faces every time, and we keep each other on our toes, and have tons of fun whilst doing so.
When Just asked me a few weeks back if I wanted to host a topic, and I immediately jumped to this opportunity. I’ve been doing some work around APIs and PowerShell, which was a topic I dreaded before. Hence, this was an excellent topic to test the knowledge of the Power BI Quiz participants. All questions are aimed at a governance perspective, and how the APIs/PowerShell can be a useful tool for helping things stay afloat.
Whilst making these questions, I also realised it would be an excellent blog post to explain the questions and answers in a blog post afterwards, to have a better explanation of my reasoning. As well, I learned it’s harder than expected to create questions that fall in the not too hard, not too easy sweet spot for a quiz. This makes me respect the other quiz makers even more!
The recording of the Power BI Quiz can be found below, or you navigate through the Power BI Quiz website
The Questions and Answers
1. The Power BI Activity Log can return data for the last ..
- A – 90 days
- B – 30 days
- C – 45 days
- D – 60 days
Explanation: The Power BI Activity Log will only return day for the last 30 days, which is different than the Office 365 Audit Logs through the Security & Compliance Centre, which will return the last 90 days. You’ll have to “sacrifice” those extra 60 days of possible history, but in return you get a more stable API (in my opinion), and less required privileges to extract that data.
Meaning, you only need to have permissions on the Power BI side (Power BI Admin Role specifically), and no roles in the Security & Compliance Center of Office 365 (At least viewer role). I have had some organisations where this was a definite no-go, as this would mean they could extract the logs for all Office 365 components.
2. The Export to File API will work with workspaces using
- A – An A1 Sku (or higher)
- B – Power BI Pro
- C – Power BI Free
- D – A P1 Sku (or higher)
Answer: A, D
Explanation: The Export to File API will export your Power BI Reports or Paginated Reports to file format of your choice. Based on the capacity you have backing up these requests, you’ll get a higher concurrency rate for processing these reports. In the background, it’s essentially doing the print/save as options you can manually, but more optimised for bulk usage.
Some things to note .. Right now, you can technically get it to work with a PPU license, but throttling will hit you fast ‘n hard. If this behaviour will persist after General Availability on April 2nd, I can’t tell. An A1 Sku (or higher) is intended for external (outside of your own organisation) embedding scenarios, and can help you get started with this scenario really quickly.
3. Power BI dataflows can be created/copied through API calls
- A – TRUE
- B – FALSE
Explanation: Yes, it’s possible! But it’s not easy 😃. This process is based on the Import APIs, which lets you upload .pbix files for a report, or the .json files for dataflows. While this sounds easy enough, there’s a few hoops you have to jump through, which I never got working on my own. Luckily, a few community members have posted working options and scripts to do this, and now there’s even an External Tool for it.
4. To determine if a .pbix file uses DirectQuery through APIs, I can
- A – use Report Information
- B – use Dataset Information
- C – use Datasource Information
- D – use DirectQuery RefreshSchedule
Answer: B, C, D
When using the Dataset Information (GetGroupsAsAdmin, with a $expand on datasets), it will return a field called ‘ContentProviderType’, which displays the connection mode for the .pbix file. For those files that run DirectQuery, you’ll see a value ‘PbixInDirectQueryMode’.
When using the Datasource Information (GetDataSourcesAsAdmin), there’s 2 fields called Name and ConnectionString. Based on the tests I did on my environments, they only returned values when DirectQuery is being used. Would I trust this completely to base myself on for all scenarios? Definitely not, but it’s something!
When using the DirectQuery RefreshSchedule, you can call this API for every dataset. The ones that actually return a schedule are either DirectQuery or LiveConnection. Based on the DataSourceType (pretty much exclude Power BI / Analysis Services), the returned records are datasets which have the default behaviour for caching enabled. Since you have to loop over all your datasets individually, you have a high risk of running into limits on this one, and an additional chance that someone disabled the caching behaviour.
To conclude this, the most durable solution I found there was to use the Dataset Information, but this has not yet been tested with Composite Models v1, or Direct Query over Azure Analysis Services and Power BI Datasets.
5. “GetScanResult” is the only API that returns Dataset endorsements
- A – TRUE
- B – FALSE
Explanation: At the time of writing (and airing), this is definitely the case! GetScanResult is part of the asynchronous mechanism to incrementally fetch all the lineage information on your Power BI Tenant. Essentially it’s the same that’s happening under the covers of Azure Purview, and this allows you to build your own solution.
To successfully do this, you need to handle the calls for modified workspaces, poll for the request state, and then get the results back. If you’re fairly new to APIs & PowerShell, this is definitely daunting. Luckily, there’s community resources out the there to help us get started.
6. Using a Service Principal (SPN) I can use these read-only Admin APIs
- A – GetDatasetsAsAdmin (Datasets)
- B – AddUserAsAdmin (Groups)
- C – GetDatasourcesAsAdmin (Datasets)
- D – GetCapacitiesAsAdmin (Capacities)
Answer: C, D
Explanation: Read-only Admin APIs and Service Principal authentication were only announced in December 2020, but they made a really big difference for the type of things we’re doing. They allow us to use an Azure AD App Registration, and extract the metadata details on our Power BI Tenant we need, fully unattended! Right now, there’s only an initial batch of APIs that are allowed for SPN authentication, but I expect more to arrive over time.
The only thing I find a bit wonky right now is that we can call GetDataSourcesAsAdmin, which requires a DatasetID. Yet we can not call GetDatasetsAsAdmin to help us get started. Meaning, if we want to iterate over datasets unattended, we have to call a different API (such as GetGroupsAsAdmin, $expand datasets, or GetDatasets) to help us get started. Using the GetCapacitiesAsAdmin, we can return all the info we need on the capacities we have in our organisation.
AddUsersAsAdmin is not allowed, which makes sense as it’s technically not a read-only Admin API.
7. Using “GetGroupsAsAdmin” I can use the $Expand parameter for
- A – datasets
- B – datasources
- C – users
- D – apps
Answer: A, C
Explanation: GetGroupsAsAdmin is my bread and butter for getting most details out of my Power BI Tenant, especially in combination with $Expand. Before this existed, we had to individually loop over all the object (nested calls most of the time) to return the required results. For larger tenants, this meant bumping into API Limits all .. the .. time .. The GetGroupsAsAdmin API is called once for every workspace, and will return all the specified information in a single go. There’s a limit for 5000 workspaces per call, but you can work your way around this by intelligently looping over this.
The $expand parameter can be used for users, reports, dashboards, datasets, dataflows, workbooks. Meaning datasources and apps are not a part of this.
8. I can use the (Datasets) UpdateDatasources API to change a parametrized Connection String
- A – TRUE
- B – FALSE
Explanation: We can use API calls to alter connection strings, for instance to change between DEV/PRD environments, or perform a migration. One of those options is using the UpdateDatasources API, to do so. When using DirectQuery datasources, I’ve had some real issues when using a parametrized ConnectionString, as it would not allow it. When reading the Docs page, it actually does outline this a restriction, and tells us to use the UpdateParameters API to change the parameters, and not the connection string. In full transparency, I’d look at Rebind Report operations, or External Tools to assist in this process, as I had some kinks to work out when doing this.
9. To safely unassign a Workspace from a Capacity using APIs I can
- A – Delete the workspace, it can not be unassigned through the API
- B – Use “CapacityAssignmentStatus” with -UnAssign and Workspace GUID
- C – Use “UnassignFromCapacity” with the workspace GUID
- D – Use “AssignToCapacity” with an empty GUID (0000..00) for CapacityID
Explanation: Off the bat, let me emphasise on the word “safely” in the question. When deleting a workspace it’s no longer assigned to the workspace, but we also lost the content that was included in this..
The only real solution to this is using the AssignToCapacity API, with an empty GUID (00000000-0000-0000-0000-000000000000), and thus assigning to a default capacity (the shared one for Power BI Pro). CapacityAssignmentStatus is an actual API, but it’s only used for getting the status reports on certain workspaces, when making the switch.
UnassignFromCapacity does not exist as an official API, and is something I completely made up 😃
10. To inventarise who CAN use PPU (Premium Per User) features, I have to:
- A – Check the Power BI Activity Log
- B – Refer to the PPU Tab in the Power BI Admin Portal
- C – Grab licensing information (ie. through Graph API)
- D – Check Power BI Workspace information
Answer: A, C, D
Explanation: First off, I have to explain ‘inventarise’, as it caused some confusion during the quiz. My meaning for this is the assemble a list of users who have the licenses and workspace access to perform PPU activities. To get the license information, we have to look at 2 different sources. The easiest one is the licenses that were assigned by a License or Power BI Admin, and these be extracted through the Graph API for instance. The hardest one is the In-Product Trial Experience, where users opt-in for the paid trial and get a free 60 day access pass to take it for a spin. The only way we can extract that information is a part of the Power BI Activity Log (or Office 365 Audit Log). Specifically we need to look for the ‘OptInForProTrial’, and ‘OptInForExtendedProTrial’ actions.
Then, we need to look at workspaces that are assigned to the Premium Per User reserved capacity, which we can do with the GetGroupsAsAdmin API, preferred with the $expand on users, to return the access list for the workspace. Luckily, I made a write-up on this process a while back, that holds some information on what you can do to prevent it, and steps to figure out who’s doing it.
One remark, which was stressed by Sir Saxton of Cubes as well, is the essential purpose of storing the Activity Logs as soon as you can. Due to the limited retention (30 days) you can not go back in time to when these licenses were made available. This is just one of the use cases where the Activity Log comes to the rescue, and I run into different uses on a daily basis. If there’s anything you should learn from this, it’s to get that extraction and retention up and running, ASAP!
I had lots of preparing the questions, and co-hosting the quiz with Just. I’m hoping people have learned something new on this, and that they’ll dabble into some of these things themselves as well. But, why not join us for the real deal next time?
Next time we’re doing the Power BI Quiz is on Wednesday April 7th at 8PM UTC+1, with the co-host to be announced
Subscribe to Just’s YouTube channel, and you’ll get the notifications for it as well
Speaking At : Data Event Vienna 2021 (SQLSaturday #1015), January 15th 2021
This Friday, I’m coming out of my hibernation for presenting at remote events, and it’s a special one. The final SQL Saturday as we know it will be held virtually in Vienna, on Friday January 15th 2021. PASS is dissolving that same day, and the future of the SQL Saturday brand is unsure. SQL Saturday has been a very important part of my community engagements throughout the past years. From attending in Utrecht, to speaking at my first SQL Saturday in Munich, to helping to organise our own SQL Saturday Belgium. It’s been one heck of a ride, to say the very least.
I’ll be talking about Impactful Data Visualisations, and some things you can keep in mind to design them (Session Details). Kicking off at 10:15AM, you can find me in the “Power BI & Power Platform 1” room. Apart from my usual ramblings, there’ll be a stellar lineup with interesting topics to cater to your every needs. Registration is free, and open until Friday. Head over to their registration page, to join in on the fun!
SQL Saturday Vienna (1015)
Before digging into this, let’s be clear about 1 thing. When Power BI Premium Per User (PPU for short) and Power BI Premium v2 were announced at Ignite 2020, the collective user base of Power BI rejoiced, and I am happy to be among them. PPU is going to be an excellent addition to leverage Premium features, without having to pay for an entire P-sku, or deal with spinning up an A-sku. Left, right, and center, we’re seeing interesting new use cases pop up to make the most out of PowerBI. And I .. LOVE IT!
I do have some concerns right now, which will likely be taken away as we get closer to General Availability. An obvious one is that as of now (January 2021), there’s no public details about the pricing. By default, users can assign themselves a PPU license, and start building things as they see fit. Self-Service BI is great, especially when there’s some guard rails in place, and when a Data Culture is actively stimulated. Matthew Roche’s series, goes into this in a splendid fashion.
The other day, I was chatting with one of my clients about Premium Per User, and I gave them the practical guidance to not build any production level dependencies based on PPU features or workspaces, until some of the unknowns have been cleared up. If there’s end users relying on this for their actual daily job, then I’m calling it a production level dependency. Right now, these are preview features, and this client is not actively monitoring changes in the Power BI Landscape.
Shortly after, I got a message that some of their business users did build actual production reports and dataflows in PPU workspaces. And, they were not sure who in the company actually has access to PPU. And that’s where chase down the rabbit hole began 😃
Who can use PPU Features in my tenant?
By default, every user will be able to assign themselves a PPU trial license, and start experimenting. Users can also be assigned a PPU License through the M365 Admin Center, as outlined in the Power BI Premium per user public preview now available post by Chris Finlan.
This behaviour can be allowed, disabled, or scoped to a specific group of users. By default it will look the same, as shown in the image below.
In combination with the tenant setting for who can create workspaces, this will control the PPU experience, and how freely users can create experiences for themselves. By default, every user in your organisation can create new workspaces, and these will automatically be in the New Workspace Experience (v2 Workspaces).
Okay now, but who’s actually doing something with PPU Features in my tenant?
Attempt 1 – List out PPU Workspaces, get Capacity Information
My first thought was to grab an overview of workspaces that are on Dedicated Capacity using the Power BI PowerShell cmdlets (Get-PowerBIWorkspaces & Get-PowerBICapacity), and filter out workspaces that are linked to a PPU Capacity.
Then, for those workspaces, get the Users and/or Groups that have access to them using an InvokeRestMethod to the Admin API (GetGroupsAsAdmin, expand=users)
This works, to a certain extent. A few of the exceptions I thought of so far.
- Just because a user is in a group that has access, doesn’t mean they actually have a PPU Subscription.
- A user could have a PPU Subscription, and have left the PPU Workspaces.
Attempt 2 – Get License Information
We can grab the license information through the MS Online cmdlets (Get-MsolUser), AzureAD cmdlets (GetAzureADUserLicenseDetail), or the Graph API (https://graph.microsoft.com/v1.0/me/licensedetails).
The MS Online cmdlets don’t support authenticating through Service Principals, so don’t actually build any dependencies on this.
All these options give me a nice overview of :
- Power BI Standard (Free) with ServicePlan: BI_AZURE_P0
- Power BI Pro with ServicePlan : BI_AZURE_P2
- Power BI Premium Per User with ServicePlan: BI_AZURE_P3
- Office 365 E5 Subscriptions
My real concern is that this only lists the users that have been assigned this as a (purchased) Product, as part of the “skuPartNumber” PBI_PREMIUM_PER_USER.
Meaning, only the users that were assigned this subscription by an admin will show up this way. Usually, they will be tied to groups as well, which can easily be exported to get an overview.
Right now, I’m still missing the most important group of all, which is those that have assigned themselves an in-product trial version for Premium Per User from within the Power BI Service.
These are the users that are potentially flying under the radar, and exactly the ones that we want to identify
Attempt 3 – Microsoft 365 Admin Center
After being hinted by Jan Pieter Posthuma to read through the Self-service purchase FAQ | Microsoft Docs, there’s a screen where you can get an overview of paid trials. Alas, this isn’t giving me any results I expect to see, and seems to be tailored specifically to M365 products.
My own user has a trial license from within the Power BI Service, and this also doesn’t show up in the personal overview of Subscriptions & Licenses. Which explains to me why it doesn’t show up in the Admin Center either.
Attempt 4 – Track the Power BI Activity Log
Similar to the “OptInForProTrial” and “OptInForExtendedProTrial” activities that appear in the Power BI Activity Log when user assign themselves an in-product trial Power BI Pro Subscription, I was hoping to see the same for Power BI Premium Per User subscriptions. I’ve been using this method to track users getting a Pro Trial, and potentially will be needing an actual paid license within 60 days.
I assigned a test user an in-product trial subscription, and grabbed the activity logs the day after. This activity actually also shows up as “OptInForProTrial”, which made sense as soon I actually read through the post from Chris Finlan again, specifically the How To Get Started, Existing Users section.
Existing Free users – You will be given access to the PPU capabilities during the public preview period if you choose to opt-in to a Pro trial as an added benefit. Since it will happen behind the scenes, the UI in the portal will still reflect that you have a Pro trial, but you will be able to access all the PPU features.
For the set of customers that have disabled their in-product trial experience, your Microsoft 365 tenant admin can opt-in to a trial experience there.
Going out on a limb, I’m assuming the in-product trial functions the same way for both Pro and PPU subscriptions, and this is why there’s no distinction possible in the User Interface or Activity Logs. Likely we’ll see some changes be made in the future, but I’ve no guarantees to back this up.
Bringing the attempts together
To circle back to an exhaustive overview, I’m rolling with a combination of Attempts 1, 2, and 4.
- Getting the actual licensing information will give us the details on the users that have been assigned a license by an admin. (Attempt 2)
- Monitor the Activity Log for “OptInForProTrial” activities
Right now the trial experience is the same for Pro and PPU Subscriptions, which means this is about as exhaustive as the list is going to be. If you don’t have an extraction set up for Activity Logs, I suggest you do it as soon as possible.
Then, having identified the PPU Workspaces in Attempt 1, we can specifically track these in the Activity Log. If we’re seeing some specific activities in there by certain users, this could point out there’s usage going on beyond testing and development.
Getting this question, I underestimated the steps to get a complete answer, which is why I thought it could prove to be useful for other people. My biggest learning here, was that free users can actually assign themselves a PPU Subscription, but that it looks like a Pro Subscription in the user interface. Definitely something to keep in mind when determining your licensing strategy for Power BI.
Most importantly, this outlines once more why grabbing and storing the Power BI Activity Logs as soon as possible is crucial to understanding the usage in your tenant, and being able to act on that. Looking for an example on how to extract these logs? There’s some examples on Github by Melissa Coates, Alex Powers, or Aaron Nelson which can easily help you get started.
The hardest part to track are the in-product trials, and there’s varying opinions on these. I’ve seen organisations disable the “Allow users to try paid features” tenant setting, where users always have to be assigned a subscription, meaning they have to be available as well.
Additionally, you want to make sure your corporate workflows are fluid and fast enough to process incoming requests for subscriptions.
It’s all about walking the fine line between Self-Service BI and Managed Corporate BI, and finding the strengths your organisations can play to.
Thanks for reading!