[Hubs] Ingestion memory errors due to use of strcat() #2150

@flanakin

🐛 Problem

Ingestion memory limit errors due to how strings are being checked for values:

We've been having identical issues recently. We're still testing the solutions and trying to solve other similar issues, but I'll share what we've found so far.

What we did is modify the KQL function Costs_transform_v1_2 in the Ingestion Database. On line 124 we changed
"and isnotempty(strcat(x_SkuMeterId, x_SkuOfferId))" into
"and isnotempty(x_SkuMeterId) and isnotempty(x_SkuOfferId)".
This greatly reduced the memory usage of the KQL function and allowed the Data Factory pipelines to run without errors (so far!), whereas we previously had frequent intermittent errors for several files.
Before the change, we would see errors like the one below after the ingestion_ETL_dataExplorer pipeline failed in Data Factory:
Image

All errors pointed to the Costs_transform_v1_2 function hitting memory limits because of the amount of data it attempts to process. It kept hitting memory limit errors when the ingested data was too large, even though we had enabled the "File partitioning" option in Cost Management exports.

Image

Originally posted by @Pasc104 in #2110

🔧 Environment

TODO: Complete the following (remove any that are not applicable):

  • FinOps hub version: 12+ (?)
  • Billing account type: Unknown
  • Power BI report type: N/A
  • Cost Management export: Unknown

ℹ️ Additional context

We should look for all places where this pattern is used and fix them all. Instead of using strcat() to check for multiple empty strings, check them all individually.
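One thing to watch when rewriting this pattern: isnotempty(strcat(a, b)) is true when either value is non-empty, so the choice between "and" and "or" in the individual checks changes which rows pass. The following is a minimal sketch, not taken from the hub code, that compares the three forms on a small inline table. The sample values and the output column names (ConcatCheck, BothSet, EitherSet) are made up for illustration.

```kusto
// Hypothetical sample rows covering every empty/non-empty combination.
datatable(x_SkuMeterId: string, x_SkuOfferId: string)
[
    "meter-1", "offer-1",
    "meter-2", "",
    "",        "offer-3",
    "",        ""
]
| extend
    // Original pattern: builds a concatenated string per row, then tests it.
    ConcatCheck = isnotempty(strcat(x_SkuMeterId, x_SkuOfferId)),
    // Individual checks joined with "and": true only when both are set.
    BothSet = isnotempty(x_SkuMeterId) and isnotempty(x_SkuOfferId),
    // Individual checks joined with "or": matches ConcatCheck on every row.
    EitherSet = isnotempty(x_SkuMeterId) or isnotempty(x_SkuOfferId)
// Row 1: ConcatCheck=true,  BothSet=true,  EitherSet=true
// Row 2: ConcatCheck=true,  BothSet=false, EitherSet=true
// Row 3: ConcatCheck=true,  BothSet=false, EitherSet=true
// Row 4: ConcatCheck=false, BothSet=false, EitherSet=false
```

Whether "and" or "or" is the right replacement depends on what each filter is meant to select, so each occurrence is worth checking against its intent rather than replacing mechanically.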

Also review other uses of strcat() to ensure we don't have other potential risky areas that haven't been surfaced yet.
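As a starting point for that review, a query along these lines could list stored functions that call strcat(). This is a sketch that assumes it is run in the context of the Ingestion database by someone with permission to view functions. It only finds candidates: each hit still needs a manual look, since most uses of strcat() are probably legitimate string building rather than emptiness checks.

```kusto
.show functions
// Keep only stored functions whose body contains a strcat( call.
| where Body contains "strcat("
| project Name, Folder
| order by Name asc
```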

🙋‍♀️ Ask for the community

We could use your help:

  1. Please vote this issue up (👍) to prioritize it.
  2. Leave comments to help us solidify the vision.

Metadata

Labels

  • Skill: KQL (KQL queries and Data Explorer integration)
  • Tool: FinOps hubs (Data pipeline solution)
