Import 507-ai-inference folder from leak-detection-e2e branch on Azure DevOps. #558

Open
TechPreacher wants to merge 6 commits into microsoft:main from TechPreacher:integration/507-ai-inference-imported-merge-2026-05-21
Conversation

@TechPreacher

Pull Request

Description

This PR selectively merges the validated 507 AI inference imported changes into the current 507 implementation while preserving the hardened runtime and deployment posture already in the branch.

The merge restores the expected service bootstrap and inference path, aligns the chart/runtime model directory contract, updates the ONNX backend implementation and supporting example/docs, and hardens the ACR container build by fixing ONNX Runtime linking and removing dead scaffold from the image.

Related Issue

Relates to the 507 AI inference imported merge work tracked in .copilot-tracking/changes/2026-05-21/507-ai-inference-imported-merge-changes.md.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Component modification or addition
  • Documentation update
  • CI/CD pipeline change
  • Other (please describe):

Implementation Details

  • Restored CLI override handling in charts/gen-patch.sh so image and namespace values can be supplied explicitly without changing default behavior.
  • Reconciled the deployment environment contract to use the runtime model directory setting expected by the service.
  • Restored YAML bootstrap and model loading in services/ai-edge-inference/src/main.rs while keeping the current MQTT session orchestration path.
  • Ported the functional ONNX backend flow into services/ai-edge-inference-crate/src/backends/onnx.rs and updated engine.rs to preserve the required output mapping and safer metadata handling.
  • Updated the crate example and rustdoc example so the test suite and doctests compile cleanly against the current API.
  • Fixed Dockerfile.acr to use ONNX Runtime system linking correctly and removed the unused /installroot scaffold from the final image path.
  • Documented the local protoc/protobuf prerequisite needed for manual builds.

Testing Performed

  • Unit tests
  • Integration tests
  • Manual validation
  • Other:

Validation performed for this branch:

  • cargo check --manifest-path src/500-application/507-ai-inference/services/ai-edge-inference-crate/Cargo.toml
  • cargo test --manifest-path src/500-application/507-ai-inference/services/ai-edge-inference-crate/Cargo.toml
  • cargo check --manifest-path src/500-application/507-ai-inference/services/ai-edge-inference/Cargo.toml
  • bash -n src/500-application/507-ai-inference/charts/gen-patch.sh
  • kustomize build src/500-application/507-ai-inference/charts
  • docker build -f src/500-application/507-ai-inference/services/ai-edge-inference/Dockerfile src/500-application/507-ai-inference/services
  • docker build -f src/500-application/507-ai-inference/services/ai-edge-inference/Dockerfile.acr src/500-application/507-ai-inference/services

Validation Steps

  1. Run the crate checks and tests in src/500-application/507-ai-inference/services/ai-edge-inference-crate/.
  2. Render the chart with kustomize build src/500-application/507-ai-inference/charts.
  3. Build both service container images from src/500-application/507-ai-inference/services.
  4. Verify the generated patch script still accepts explicit overrides and produces valid output.

Checklist

  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have run terraform fmt on all Terraform code
  • I have run terraform validate on all Terraform code
  • I have run az bicep format on all Bicep code
  • I have run az bicep build to validate all Bicep code
  • I have checked for any sensitive data/tokens that should not be committed
  • Lint checks pass (run applicable linters for changed file types)

Security Review

  • No credentials, secrets, or tokens are hardcoded or logged
  • RBAC and identity changes follow least-privilege principles
  • No new network exposure or public endpoints introduced without justification
  • Dependency additions or updates have been reviewed for known vulnerabilities
  • Container image changes use pinned digests or SHA references

Additional Notes

The branch contains an imported merge of validated 507 AI inference changes rather than a single isolated fix, so the PR spans runtime, chart, crate, and container build updates.

Screenshots (if applicable)

None.

This commit introduces several improvements across the AI inference service and development experience:

*   **Refactor(ai-edge-inference-crate):** Implements thread-safe metric and backend statistics updates using `Mutex` and `RwLock` with graceful error handling for poisoned locks, significantly improving service robustness and reliability of reported metrics.
*   **Feat(deployment):** Updates the `busybox` init container image to include a SHA256 digest, enhancing image immutability and security. Changes the default `RUST_LOG` level to `info` for reduced verbosity in deployed environments.
*   **Chore(vscode):** Adds a comprehensive set of VS Code tasks for Terraform (validation, linting, docs, testing, planning), Markdown linting, and "Adhoc" tasks for Cargo, Kustomize, and YAML linting, streamlining developer workflows.
*   **Docs(readme):** Updates the README to include `protoc` as a manual setup prerequisite.
*   **Fix(scripts):** Quotes variables in `gen-patch.sh` for improved shell script robustness.
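
The poisoned-lock handling described in the refactor bullet can be sketched as follows. This is a minimal illustration, not the crate's actual code: the `Metrics` and `SharedStats` types and their fields are hypothetical, and the only standard-library behavior relied on is that `PoisonError::into_inner` returns the guard even after another thread panicked while holding the lock.

```rust
use std::sync::{Mutex, RwLock};

/// Hypothetical counters; the real crate's metric fields are not shown in this PR.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Metrics {
    pub inferences: u64,
    pub failures: u64,
}

/// Hypothetical shared state: a Mutex for frequently written counters and
/// an RwLock for a read-mostly backend status string.
#[derive(Default)]
pub struct SharedStats {
    pub metrics: Mutex<Metrics>,
    pub backend_status: RwLock<String>,
}

impl SharedStats {
    /// Record one inference result. A poisoned lock is recovered with
    /// `into_inner` instead of panicking, so a panic in one thread does not
    /// stop every later metric update.
    pub fn record(&self, ok: bool) {
        let mut m = self
            .metrics
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        m.inferences += 1;
        if !ok {
            m.failures += 1;
        }
    }

    /// Copy the current counters out from under the lock.
    pub fn snapshot(&self) -> Metrics {
        let guard = self
            .metrics
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        guard.clone()
    }

    /// Replace the backend status, tolerating a poisoned RwLock.
    pub fn set_backend_status(&self, status: &str) {
        let mut s = self
            .backend_status
            .write()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        *s = status.to_string();
    }

    /// Read the backend status, tolerating a poisoned RwLock.
    pub fn backend_status(&self) -> String {
        let guard = self
            .backend_status
            .read()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        guard.clone()
    }
}
```

Recovering the guard keeps metrics flowing, at the cost of possibly observing a half-finished update from the thread that panicked. For plain counters that trade-off is usually acceptable; for state with stricter invariants, logging and resetting may be the safer choice.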
TechPreacher requested a review from a team as a code owner on May 22, 2026 05:47
@github-actions

📚 Documentation Health Report

Generated on: 2026-05-22 06:02:21 UTC

📈 Documentation Statistics

| Category | File Count |
| --- | --- |
| Main Documentation | 222 |
| Infrastructure Components | 197 |
| Blueprints | 39 |
| GitHub Resources | 25 |
| AI Assistant Guides (Copilot) | 17 |
| Total | 500 |

🏗️ Three-Tree Architecture Status

  • ✅ Bicep Documentation Tree: Auto-generated navigation
  • ✅ Terraform Documentation Tree: Auto-generated navigation
  • ✅ README Documentation Tree: Manual README organization

🔍 Quality Metrics

  • Frontmatter Validation: success
  • Link Validation: success

This report is automatically generated by the Documentation Automation workflow.

@codecov-commenter

codecov-commenter commented May 22, 2026

Codecov Report

❌ Patch coverage is 0% with 74 lines in your changes missing coverage. Please review.
✅ Project coverage is 32.09%. Comparing base (41b933a) to head (63491e9).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...nce/services/ai-edge-inference-crate/src/engine.rs | 0.00% | 62 Missing ⚠️ |
| ...i-inference/services/ai-edge-inference/src/main.rs | 0.00% | 12 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #558      +/-   ##
==========================================
- Coverage   32.41%   32.09%   -0.32%     
==========================================
  Files          40       40              
  Lines        5902     5960      +58     
==========================================
  Hits         1913     1913              
- Misses       3989     4047      +58     

| Flag | Coverage Δ |
| --- | --- |
| rust | 32.09% <0.00%> (-0.32%) ⬇️ |
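
The percentages in the coverage diff above follow from Hits divided by Lines. A minimal sketch of that arithmetic, using a hypothetical helper `coverage_pct` that is not part of this PR or of Codecov. How Codecov rounds or truncates the displayed values is not stated here, so the comments avoid claiming exact printed output.

```rust
/// Hypothetical helper: coverage percentage from hit and total line counts.
/// Returns 0.0 for an empty file set to avoid dividing by zero.
pub fn coverage_pct(hits: u64, lines: u64) -> f64 {
    if lines == 0 {
        return 0.0;
    }
    hits as f64 * 100.0 / lines as f64
}

/// Change in coverage between base and head, in percentage points.
pub fn coverage_delta(base_hits: u64, base_lines: u64, head_hits: u64, head_lines: u64) -> f64 {
    coverage_pct(head_hits, head_lines) - coverage_pct(base_hits, base_lines)
}
```

With the figures from the report (1913 hits in both cases, 5902 lines on main and 5960 on the PR head), the hits stay constant while the line count grows, which is why project coverage drops even though no existing test regressed.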

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...erence/services/ai-edge-inference-crate/src/lib.rs | 100.00% <ø> (ø) |
| ...i-inference/services/ai-edge-inference/src/main.rs | 0.00% <0.00%> (ø) |
| ...nce/services/ai-edge-inference-crate/src/engine.rs | 0.00% <0.00%> (ø) |
@kgmwang1
Contributor

Thanks for this addition to the repo! I will add a couple of small comments, but everything mostly looks great.



let models_dir = PathBuf::from(
std::env::var("MODELS_DIRECTORY").unwrap_or_else(|_| "/models".to_string()),

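In the Dockerfile, the env variable is MODEL_DIRECTORY, but in /src/500-application/507-ai-inference/services/ai-edge-inference/src/config.rs it is MODELS_DIRECTORY. If the names stay different, a value set under the Dockerfile's name would not be read by the lookup above, and the service would fall back to /models.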
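
One way to make this kind of name mismatch testable is to resolve the directory through an injected lookup instead of reading the process environment directly. The sketch below is an assumption-laden illustration, not the service's code: `resolve_models_dir` and the constants are hypothetical names, and accepting the singular `MODEL_DIRECTORY` as a fallback is only one option. Simply aligning the Dockerfile and the service on one name is the more direct fix.

```rust
use std::path::PathBuf;

/// Name read by the service snippet quoted in this review.
pub const MODELS_DIR_VAR: &str = "MODELS_DIRECTORY";
/// Name the review reports seeing in the Dockerfile.
pub const LEGACY_MODEL_DIR_VAR: &str = "MODEL_DIRECTORY";
/// Default used by the quoted snippet when the variable is unset.
pub const DEFAULT_MODELS_DIR: &str = "/models";

/// Hypothetical helper: resolve the model directory through an injected
/// lookup so the precedence rules can be tested without mutating the
/// process environment. Empty or whitespace-only values count as unset.
pub fn resolve_models_dir<F>(lookup: F) -> PathBuf
where
    F: Fn(&str) -> Option<String>,
{
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    non_empty(MODELS_DIR_VAR)
        // Assumption: tolerate the singular name rather than silently ignoring it.
        .or_else(|| non_empty(LEGACY_MODEL_DIR_VAR))
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(DEFAULT_MODELS_DIR))
}

/// Production entry point: the same rules, backed by the real environment.
pub fn models_dir_from_env() -> PathBuf {
    resolve_models_dir(|name| std::env::var(name).ok())
}
```

A unit test can then pass a closure that only knows `MODEL_DIRECTORY` and check that the configured path, rather than the default, comes back.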

}

/// Determine output shape from the flat output vector and model metadata
fn infer_output_shape(&self, output_len: usize, model: &OnnxModel) -> Vec<usize> {

The shape value returned on line 240 should be the source of truth for the tensor shape. The value derived in this function may be incorrect if labels are missing. I suggest propagating the value from line 240 and using it as the shape.
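
The suggestion above could look roughly like the following. This is a sketch under stated assumptions: `resolve_output_shape` is a hypothetical free function, and the runtime-reported shape is assumed to arrive as a slice of `i64` dimensions, where a non-positive entry marks a dynamic or unknown dimension. The actual type returned on line 240 is not visible in this thread.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: prefer the shape reported by the runtime and fall
/// back to a flat `[output_len]` shape only when that shape is unusable.
///
/// The runtime shape is rejected when it is empty, contains a non-positive
/// (dynamic or unknown) dimension, or its element count does not match the
/// flat output length.
pub fn resolve_output_shape(runtime_shape: &[i64], output_len: usize) -> Vec<usize> {
    // Convert every dimension; a single bad dimension yields None overall.
    let dims: Option<Vec<usize>> = runtime_shape
        .iter()
        .map(|&d| usize::try_from(d).ok().filter(|&d| d > 0))
        .collect();

    match dims {
        Some(dims) if !dims.is_empty() => {
            // checked_mul guards against overflow on very large shapes.
            let count = dims.iter().try_fold(1usize, |acc, &d| acc.checked_mul(d));
            if count == Some(output_len) {
                dims
            } else {
                vec![output_len]
            }
        }
        _ => vec![output_len],
    }
}
```

Because the shape no longer depends on label metadata, a model with missing labels still gets the dimensions the runtime actually produced.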
