Add FabricNotebookAPITools to samples folder#509
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new sample under samples/ that provides Fabric Notebook (PySpark) utilities for tenant-wide lakehouse table inventory via Fabric REST APIs, following up on the request to relocate these assets into the samples area.
Changes:
- Added
GetAllTables.pysample script to enumerate workspaces → lakehouses → tables and register results as a Spark temp view. - Added a
README.mddescribing the tool, output schema, limitations, and example query.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| samples/notebook_fabric_api_tools/README.md | Documents the FabricNotebookAPITools sample and the fabric_lakehouse_inventory temp view schema/query usage. |
| samples/notebook_fabric_api_tools/GetAllTables.py | Implements paginated REST calls to inventory tables across the tenant and exposes results as a Spark temp view for SQL querying. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| workspace_failed = False | ||
|
|
There was a problem hiding this comment.
workspace_failed is assigned but never used. This adds noise and makes it look like downstream error handling is expected. Consider removing it, or use it to collect/report failed workspaces at the end (similar to schema_enabled_lakehouses).
| while url: | ||
| resp = requests.get(url, headers=api_headers) | ||
| resp.raise_for_status() | ||
| body = resp.json() |
There was a problem hiding this comment.
Network call robustness: requests.get(...) is called without a timeout inside a potentially long-running pagination loop. A transient hang will stall the whole notebook run. Consider passing an explicit timeout (and optionally retry/backoff for 429/5xx) to make the sample more reliable.
| elif body.get("continuationToken"): | ||
| separator = "&" if "?" in url else "?" | ||
| base_url = url.split("?")[0] if "?" in url else url | ||
| url = f"{base_url}{separator}continuationToken={body['continuationToken']}" |
There was a problem hiding this comment.
Pagination bug: when falling back to continuationToken, separator is computed from the current url (which already contains ?continuationToken=...), but then you strip the query into base_url. This can generate URLs like .../tables&continuationToken=... (missing ?) on the 2nd page and break pagination. Consider building the next URL via urllib.parse (preserve existing query params) or recompute the separator after base_url.
MarkPryceMaherMSFT
left a comment
There was a problem hiding this comment.
@hurtn This looks okay, can you confirm and it okay, merge it.
dataassets1
left a comment
There was a problem hiding this comment.
Hi @MarkPryceMaherMSFT , thanks for the review. I've addressed all of Copilot's feedback in :
- Added a 30-second timeout on every requests.get() call so the notebook fails fast rather than hanging indefinitely if the API stalls.
- Fixed the pagination URL construction in get_all_pages using urllib.parse, so existing query parameters are preserved and no malformed URLs are produced on the continuationToken fallback path.
- Removed the unused SparkSession import.
- Replaced the unused workspace_failed flag with a failed_workspaces list that's summarized at the end of the run, so errors are actually visible instead of swallowed.
- Made get_all_pages resilient to both "value" and "data" result keys, so it works uniformly across the workspaces, lakehouses, and tables endpoints.
- Guarded the final display(spark.sql(...)) inside the if inventory_records: block so it doesn't fail on an empty tenant (when the temp view doesn't exist).
- Ran it in a Fabric notebook against my workspace — temp view is created and the skip/failure summaries print as expected.
@hurtn — ready for your look whenever you have a moment. Thanks!
|
@hurtn Please review. |
|
Colleagues, if anyone can serve as a second reviewer for this PR, I would be very grateful! |
Pull Request
Follow-up to #483 — moved files to samples/ as requested by @hurtn. Provides a set of Spark Notebook tools for tenant-wide lakehouse table inventory.
Pull Request (PR) description
Task list
build.ps1 -ResolveDependency -Tasks build, test).