feat(fhirpath): storage-backed resolve() for server-stored resources (#167) #182

Open
mauripunzueta wants to merge 2 commits into main from feat/167-fhirpath-resolve-storage

@mauripunzueta

Contributor

Summary

Closes #167 (follow-up to #166).

resolve() previously dereferenced only contained resources and resources already in the evaluation context; a relative Type/id reference to a resource living in a storage backend fell back to a typed stub {resourceType: Type}. This adds eager, tenant-scoped pre-hydration so server-side view evaluation can dereference stored references — e.g. Observation.subject.resolve().name.

Approach

A — eager pre-hydration (the issue's recommended starting point), mirroring the existing trusted-server remote-resolve() prefetch (helios_sof::remote_fetch) but sourcing resources from a tenant-scoped ResourceStorage instead of HTTP. The evaluator stays synchronous: all storage I/O happens in async code before the engine runs — no block_on on async worker threads, no evaluator changes.
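
The two-phase shape described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: `Store`, `prefetch`, and `resolve` are hypothetical names, the store is an in-memory map rather than a `ResourceStorage`, and the prefetch phase is written synchronously here even though the real one runs in async code before the engine starts.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a tenant-scoped store:
/// (tenant id, "Type/id") -> resource JSON text.
type Store = HashMap<(String, String), String>;

/// Phase 1 (async in the real runner): fetch every wanted reference for
/// one tenant up front and build the resolution pool.
fn prefetch(store: &Store, tenant: &str, wanted: &[&str]) -> HashMap<String, String> {
    wanted
        .iter()
        .filter_map(|key| {
            store
                .get(&(tenant.to_string(), key.to_string()))
                .map(|body| (key.to_string(), body.clone()))
        })
        .collect()
}

/// Phase 2 (synchronous): the evaluator only consults the pool and does
/// no I/O. A miss falls back to a typed stub.
fn resolve(pool: &HashMap<String, String>, reference: &str) -> String {
    match pool.get(reference) {
        Some(body) => body.clone(),
        None => {
            let resource_type = reference.split('/').next().unwrap_or("");
            format!("{{\"resourceType\":\"{}\"}}", resource_type)
        }
    }
}
```

Because `resolve` only reads from the pool, it can run on a blocking thread without ever needing to re-enter the async runtime.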

What changed

  • helios-persistence — new sof::reference_resolver module:
    • collect_missing_references — a pure, order-stable scanner that extracts the distinct relative Type/id references in the resources under evaluation, excluding those already in scope.
    • StorageReferenceResolver trait + StorageBackedResolver over any ResourceStorage: tenant-scoped (read_batch), FHIR-version-matched, fan-out capped (DEFAULT_MAX_FANOUT), deduped, one level deep, no network (relative Type/id only — absolute URLs / urn: / #fragment excluded; absolute-URL resolution remains the separate allowlisted concern of remote_resolver).
  • InProcessSofRunner gains an optional resolver (with_reference_resolver): per run it collects the scanned resources' missing references, prefetches them, and folds them into the FHIRPath resolution pool via process_chunk_with_external. With no resolver, behavior is unchanged (stub/empty fallback).
  • Wired into the S3 backend's in-process runner — the FHIRPath-engine path. S3Backend serves as its own tenant-scoped resolver.
  • helios-sof: expose parse_json_to_fhir_resource_pub (was pub(crate)) so the persistence prefetch can build the typed external pool.
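
As a rough sketch of the scanner's filtering rules listed above, the following works on reference strings that have already been extracted. It is an assumption-laden illustration: `is_relative_reference` and `collect_missing` are hypothetical names, the real `collect_missing_references` scans resources rather than a string slice, and this version also rejects versioned references such as `Patient/1/_history/2`, which the PR does not discuss.

```rust
use std::collections::BTreeSet;

/// Returns true for a relative `Type/id` reference. Absolute URLs,
/// `urn:` references, and `#fragment` references are rejected.
fn is_relative_reference(reference: &str) -> bool {
    if reference.starts_with('#') || reference.starts_with("urn:") || reference.contains("://") {
        return false;
    }
    let mut parts = reference.split('/');
    match (parts.next(), parts.next(), parts.next()) {
        // Exactly two segments: a resource type and a non-empty id.
        (Some(resource_type), Some(id), None) => {
            !id.is_empty()
                && resource_type.chars().next().map_or(false, |c| c.is_ascii_uppercase())
                && resource_type.chars().all(|c| c.is_ascii_alphanumeric())
        }
        _ => false,
    }
}

/// Collects the distinct relative references that are not already in
/// scope. The BTreeSet removes duplicates and yields a stable,
/// lexicographic order.
fn collect_missing(references: &[&str], in_scope: &BTreeSet<String>) -> Vec<String> {
    let distinct: BTreeSet<String> = references
        .iter()
        .copied()
        .filter(|r| is_relative_reference(r))
        .filter(|r| !in_scope.contains(*r))
        .map(|r| r.to_string())
        .collect();
    distinct.into_iter().collect()
}
```

The function is pure: the same inputs always produce the same ordered output, which keeps it easy to unit-test without a storage backend.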

Scope boundary

This targets the in-process FHIRPath engine path (S3 and S3-primary composite backends), which is where views are evaluated by the engine. The SQLite/PostgreSQL in-DB runners compile views to SQL and do not compile resolve(); such views are already Uncompilable/422 there. Lowering resolve() to SQL joins is a separate, much larger feature that is out of scope for #167.

Acceptance criteria

  • Server-side FHIRPath resolves a relative Type/id reference to a stored resource (tenant-scoped) — resolves_stored_reference_during_view_run, resolves_stored_resource_for_owning_tenant.
  • No behavior change without a resolver (CLI / in-memory eval) — without_resolver_reference_is_not_dereferenced; the runner defaults to no resolver.
  • Evaluator remains sync; no block_on on async worker threads — prefetch is async, before spawn_blocking.
  • Tenant isolation enforced and tested — does_not_resolve_across_tenants.
  • Tests: stored hit, cross-tenant miss, not-found fallback, version match — all present (version match under R4B).

Testing

Pure-scanner unit tests (5), StorageBackedResolver integration tests vs SQLite (4), end-to-end runner tests (2). Full helios-persistence lib suite (672) green; cargo fmt clean; cargo clippy --all-targets -D warnings clean on helios-persistence and helios-sof.

🤖 Generated with Claude Code

Follow-up to #166 (#167). resolve() previously dereferenced only contained
resources and resources already in the evaluation context; a relative Type/id
reference to a resource living in a storage backend fell back to a typed stub.
This adds eager, tenant-scoped pre-hydration so server-side view evaluation can
dereference stored references — e.g. Observation.subject.resolve().name.

Approach A (eager pre-hydration), mirroring the existing trusted-server
remote-resolve() prefetch but sourcing from ResourceStorage instead of HTTP. The
evaluator stays synchronous: all I/O happens in async code before the engine runs.

- helios-persistence: new sof::reference_resolver module — a pure relative-ref
  scanner (collect_missing_references) plus a StorageReferenceResolver trait and a
  StorageBackedResolver over any ResourceStorage. Tenant-scoped (read_batch),
  FHIR-version-matched, fan-out capped, deduped, one level deep, no network
  (relative Type/id only — absolute URLs / urn: / fragments excluded).
- InProcessSofRunner gains an optional resolver (with_reference_resolver): per run
  it collects the scanned resources' missing references, prefetches them, and folds
  them into the resolution pool via process_chunk_with_external. Absent resolver →
  unchanged stub/empty fallback.
- Wired into the S3 backend's in-process runner (the FHIRPath-engine path). The
  SQL runners (SQLite/PG) don't compile resolve() — that remains Uncompilable/422
  and is out of scope here.
- helios-sof: expose parse_json_to_fhir_resource_pub (was pub(crate)) so the
  persistence prefetch can build the typed external pool.

Tests: pure-scanner unit tests; StorageBackedResolver integration tests vs SQLite
(stored hit, cross-tenant miss, not-found fallback, version match); end-to-end
runner tests proving resolve() dereferences a stored Patient during a view run and
that the no-resolver path is unchanged. Full persistence lib suite (672) green;
clippy clean.
@claude

claude Bot commented Jun 23, 2026

Helios FHIR Server

The Helios FHIR Server is an implementation of the HL7® FHIR® standard, built in Rust for high performance and optimized for clinical analytics workloads. It provides modular components that can be run as standalone command-line tools, integrated as microservices, or embedded directly into your data analytics pipeline.

Why Helios FHIR Server?

  • 🚀 Blazing Fast: Built in Rust for maximum performance and minimal resource usage
  • 📊 Analytics-First: Optimized for clinical data analytics and research workloads
  • 🔧 Modular Design: Use only what you need - from FHIRPath expressions to full server capabilities
  • 🌐 Multi-Version Support: Work with R4, R4B, R5, and R6 data in the same application
  • 🛠️ Developer Friendly: Excellent error messages, comprehensive documentation, and CLI tools

What People Build with the Helios FHIR Server

  • Clinical Research Platforms: Transform FHIR data into research-ready datasets using SQL-on-FHIR
  • Real-time Analytics Dashboards: Process streaming FHIR data for operational insights
  • Data Quality Tools: Validate and profile FHIR data using FHIRPath expressions
  • ETL Pipelines: Extract and transform FHIR data for data warehouses and lakes
  • Healthcare APIs: Build high-performance FHIR-compliant REST APIs
  • Healthcare Analytics: Analyze patient cohorts at scale

Quick Start

The Helios FHIR Server includes several components:

  • hfs — the main FHIR server
  • hts — the FHIR Terminology Server
  • fhirpath-cli and fhirpath-server — FHIRPath evaluation
  • sof-cli and sof-server — SQL-on-FHIR transformation
  • pysof — Python bindings for SQL-on-FHIR

See Core Components for details on each.

The server supports SQLite, PostgreSQL, Elasticsearch, and S3 in various configurations — see Storage Backends for setup options.

Using Release Binaries

Pre-built binaries are available on the GitHub Releases page. Download the appropriate archive for your platform and extract it.

Windows users: Add .exe to all binary names (e.g., hfs.exe, fhirpath-cli.exe, sof-cli.exe).

The following are independent examples showing how to run each executable — pick whichever ones apply to your use case:

# FHIR server (access at http://localhost:8080/metadata)
./hfs

# FHIR Terminology Server (access at http://localhost:8090/metadata)
./hts

# Evaluate a FHIRPath expression
echo '{"resourceType": "Patient", "id": "123"}' | ./fhirpath-cli 'Patient.id'

# Transform FHIR Bundle to CSV using SQL-on-FHIR
./sof-cli --view examples/patient-view.json --bundle examples/patients.json

# Transform NDJSON file to CSV using SQL-on-FHIR
./sof-cli --view examples/patient-view.json --bundle examples/patients.ndjson

# SQL-on-FHIR HTTP server (POST to http://localhost:8080/ViewDefinition/$viewdefinition-run)
./sof-server

# FHIRPath HTTP server (POST expressions to http://localhost:3000/fhirpath)
./fhirpath-server

Using Docker Images

Pre-built multi-arch Docker images (amd64/arm64) are available on GitHub Container Registry.

# FHIR Server (default: R4, in-memory SQLite, port 8080)
docker run -p 8080:8080 ghcr.io/heliossoftware/hfs:latest

# With persistent SQLite storage
docker run -p 8080:8080 -v hfs-data:/data -e HFS_DATABASE_URL=/data/fhir.db ghcr.io/heliossoftware/hfs:latest

# With PostgreSQL
docker run -p 8080:8080 \
  -e HFS_STORAGE_BACKEND=postgres \
  -e HFS_DATABASE_URL="postgresql://user:pass@host:5432/fhir" \
  ghcr.io/heliossoftware/hfs:latest

# FHIRPath Server (port 3000)
docker run -p 3000:3000 ghcr.io/heliossoftware/fhirpath-server:latest

# SQL-on-FHIR Server (port 8080)
docker run -p 8080:8080 ghcr.io/heliossoftware/sof-server:latest

# FHIR Terminology Server (default: SQLite, port 8090)
docker run -p 8090:8090 ghcr.io/heliossoftware/hts:latest

# HTS with persistent SQLite storage
docker run -p 8090:8090 -v hts-data:/data -e HTS_DATABASE_URL=/data/hts.db ghcr.io/heliossoftware/hts:latest

See Environment Variables for all available configuration options.

Building From Source

Prerequisites

  1. Install Rust

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  2. Install LLD

    Linux (Ubuntu/Debian):

    sudo apt install clang lld

    Windows:

    Download a pre-built binary from llvm-project's GitHub page.

    macOS:

    LLD is not required for macOS.

  3. Configure config.toml

    Create or modify ~/.cargo/config.toml:

    [target.x86_64-unknown-linux-gnu]
    linker = "clang"
    rustflags = ["-C", "link-arg=-fuse-ld=lld", "-C", "link-arg=-Wl,-zstack-size=8388608"]
    
    [target.aarch64-apple-darwin]
    linker = "clang"
    rustflags = [
      "-C", "link-arg=-Wl,-dead_strip",
      "-C", "link-arg=-undefined",
      "-C", "link-arg=dynamic_lookup"
    ]
    
    [target.x86_64-pc-windows-msvc]
    linker = "lld-link.exe"
    rustflags = ["-C", "link-arg=/STACK:8388608"]
  4. Memory-constrained builds (optional):

💡 Tip: If you run out of memory during compilation on Linux, especially on machines with a high CPU core count, limit parallel jobs to 4 (or fewer):

    export CARGO_BUILD_JOBS=4

Build and Install

# Clone the repository
git clone https://github.com/HeliosSoftware/hfs.git
cd hfs

# Build (R4 only by default). Uses workspace default-members and skips the Python bindings crate (pysof).
cargo build --release

# Or build with all FHIR versions
cargo build --release --features R4,R4B,R5,R6

# Build all workspace members (including pysof)
cargo build --workspace --release

Storage Backends

The Helios FHIR Server supports multiple storage backend configurations. Choose a configuration based on your search requirements and deployment scale.

Available Configurations

| Configuration | Search Capability | Use Case |
| --- | --- | --- |
| SQLite (default) | Built-in FTS5 full-text search | Development, testing, small deployments |
| SQLite + Elasticsearch | Elasticsearch-powered search with relevance scoring | Production deployments needing robust search |
| PostgreSQL | Built-in full-text search (tsvector/tsquery) | Production OLTP deployments |
| PostgreSQL + Elasticsearch | Elasticsearch-powered search with PostgreSQL CRUD | Production deployments needing RDBMS + robust search |
| S3 | Object storage for CRUD, versioning, history, and bulk operations (no search) | Archival, bulk analytics, cost-effective storage |
| S3 + Elasticsearch | Elasticsearch-powered search with S3 CRUD | Large-scale storage with full FHIR search |

Running the Server

# SQLite (default) — no external dependencies
./hfs

# SQLite + Elasticsearch
HFS_STORAGE_BACKEND=sqlite-elasticsearch \
HFS_ELASTICSEARCH_NODES=http://localhost:9200 \
  ./hfs

# PostgreSQL
HFS_STORAGE_BACKEND=postgres \
HFS_DATABASE_URL="postgresql://user:pass@localhost:5432/fhir" \
  ./hfs

# PostgreSQL + Elasticsearch
HFS_STORAGE_BACKEND=postgres-elasticsearch \
HFS_DATABASE_URL="postgresql://user:pass@localhost:5432/fhir" \
HFS_ELASTICSEARCH_NODES=http://localhost:9200 \
  ./hfs

# S3 (requires AWS credentials via standard provider chain:
# https://docs.aws.amazon.com/sdkref/latest/guide/standardized-credentials.html)
HFS_STORAGE_BACKEND=s3 \
HFS_S3_BUCKET=my-fhir-bucket \
AWS_PROFILE=your-aws-profile \
AWS_REGION=us-east-1 \
  ./hfs

# S3 + Elasticsearch
HFS_STORAGE_BACKEND=s3-elasticsearch \
HFS_S3_BUCKET=my-fhir-bucket \
HFS_ELASTICSEARCH_NODES=http://localhost:9200 \
AWS_PROFILE=your-aws-profile \
AWS_REGION=us-east-1 \
  ./hfs

Environment Variables

Server

| Variable | Default | Description |
| --- | --- | --- |
| HFS_STORAGE_BACKEND | sqlite | Backend mode: sqlite, sqlite-elasticsearch, postgres, postgres-elasticsearch, mongodb, mongodb-elasticsearch, s3, or s3-elasticsearch |
| HFS_SERVER_PORT | 8080 | Server port |
| HFS_SERVER_HOST | 127.0.0.1 | Host to bind |
| HFS_BASE_URL | http://localhost:8080 | Base URL used in Location headers and Bundle links |
| HFS_DATABASE_URL | fhir.db | Database URL (SQLite path or PostgreSQL connection string) |
| HFS_DATA_DIR | ./data | Directory containing FHIR data files (search parameters) |
| HFS_DEFAULT_FHIR_VERSION | R4 | FHIR version (R4, R4B, R5, R6) |
| HFS_LOG_LEVEL | info | Log level (error, warn, info, debug, trace) |

Limits & behavior

| Variable | Default | Description |
| --- | --- | --- |
| HFS_MAX_BODY_SIZE | 10485760 | Max request body size (bytes; applies to the decompressed body for compressed requests) |
| HFS_REQUEST_TIMEOUT | 30 | Request timeout (seconds) |
| HFS_DEFAULT_PAGE_SIZE | 20 | Default search result page size |
| HFS_MAX_PAGE_SIZE | 1000 | Maximum search result page size |
| HFS_ENABLE_REQUEST_ID | true | Enable request ID tracking |
| HFS_ENABLE_VERSIONING | true | Enable ETag versioning |
| HFS_RETURN_GONE | true | Return 410 Gone for deleted resources (vs 404) |
| HFS_REQUIRE_IF_MATCH | false | Require If-Match header for updates |

Request bodies may be sent compressed (Content-Encoding: gzip, deflate,
br, or zstd); unsupported encodings are rejected with 415. Responses are
compressed when the client sends Accept-Encoding.

CORS

| Variable | Default | Description |
| --- | --- | --- |
| HFS_ENABLE_CORS | true | Enable CORS |
| HFS_CORS_ORIGINS | * | Allowed origins |
| HFS_CORS_METHODS | GET,POST,PUT,PATCH,DELETE,OPTIONS | Allowed methods |
| HFS_CORS_HEADERS | Content-Type,Authorization,Accept,… | Allowed headers |

Multi-tenancy

| Variable | Default | Description |
| --- | --- | --- |
| HFS_DEFAULT_TENANT | default | Default tenant ID for requests without an X-Tenant-ID header |
| HFS_TENANT_ROUTING_MODE | header_only | Tenant routing: header_only, url_path, or both |
| HFS_TENANT_STRICT_VALIDATION | false | Error if URL and header tenant disagree |
| HFS_JWT_TENANT_CLAIM | tenant_id | JWT claim name for tenant |

Terminology

| Variable | Default | Description |
| --- | --- | --- |
| HFS_TERMINOLOGY_SERVER | (none) | HTS base URL for :in/:not-in search and FHIRPath terminology functions (memberOf, subsumes) |

Elasticsearch

| Variable | Default | Description |
| --- | --- | --- |
| HFS_ELASTICSEARCH_NODES | http://localhost:9200 | Comma-separated ES node URLs |
| HFS_ELASTICSEARCH_INDEX_PREFIX | hfs | ES index name prefix |
| HFS_ELASTICSEARCH_USERNAME | (none) | ES basic auth username |
| HFS_ELASTICSEARCH_PASSWORD | (none) | ES basic auth password |

PostgreSQL (used to assemble a connection when HFS_DATABASE_URL is not set)

| Variable | Default | Description |
| --- | --- | --- |
| HFS_PG_HOST | localhost | PostgreSQL host |
| HFS_PG_PORT | 5432 | PostgreSQL port |
| HFS_PG_DBNAME | helios | Database name |
| HFS_PG_USER | helios | Database user |
| HFS_PG_PASSWORD | (none) | Database password |
| HFS_PG_MAX_CONNECTIONS | 10 | Connection pool size |

MongoDB

| Variable | Default | Description |
| --- | --- | --- |
| HFS_MONGODB_URL / HFS_MONGODB_URI | (none) | MongoDB connection string |
| HFS_MONGODB_DATABASE | helios | Database name |
| HFS_MONGODB_MAX_CONNECTIONS | 10 | Connection pool size |
| HFS_MONGODB_CONNECT_TIMEOUT_MS | 5000 | Connection timeout (ms) |

S3

| Variable | Default | Description |
| --- | --- | --- |
| HFS_S3_BUCKET | hfs | S3 bucket name (prefix-per-tenant mode) |
| HFS_S3_REGION | (AWS provider chain) | AWS region override |
| HFS_S3_PREFIX | (none) | Optional key prefix prepended to all S3 object keys |
| HFS_S3_ENDPOINT | (AWS) | S3-compatible endpoint URL (e.g. MinIO http://localhost:9000) |
| HFS_S3_FORCE_PATH_STYLE | false | Path-style addressing (required by MinIO and most S3-compatible providers) |
| HFS_S3_ALLOW_HTTP | true | Allow insecure http:// endpoint URLs (only relevant when HFS_S3_ENDPOINT is set) |
| HFS_S3_VALIDATE_BUCKETS | true | Validate bucket access on startup |

Authentication & SMART-on-FHIR

JWT/bearer auth (HFS_AUTH_*, e.g. HFS_AUTH_ENABLED, HFS_AUTH_JWKS_URL, HFS_AUTH_ISSUER, HFS_AUTH_AUDIENCE) and SMART configuration endpoints (HFS_SMART_*) plus HFS_OUTBOUND_BEARER_TOKEN are documented in the helios-auth README.

Audit, subscriptions & bulk export

| Variable | Default | Description |
| --- | --- | --- |
| HFS_AUDIT_BACKEND | none | Audit backend: none, file, database, or cloudwatch |
| HFS_AUDIT_FILE_PATH | (none) | Required when HFS_AUDIT_BACKEND=file; NDJSON file path for persisted AuditEvent logs |
| HFS_AUDIT_DATABASE_URL | (none) | Optional dedicated audit database URL/path (SQLite/PostgreSQL/MongoDB families) |
| HFS_AUDIT_MONGODB_DATABASE | (none) | Optional dedicated MongoDB database name for audit events |
| HFS_AUDIT_S3_BUCKET | (none) | Optional dedicated S3 bucket for audit events |
| HFS_AUDIT_S3_PREFIX | (none) | Optional dedicated S3 prefix for audit events |
| HFS_AUDIT_S3_REGION | (none) | Optional dedicated S3 region for audit events |
| HFS_AUDIT_S3_VALIDATE_BUCKETS | (none) | Optional dedicated S3 bucket validation toggle for audit events |
| HFS_AUDIT_SOURCE_OBSERVER | Device/hfs | Sets AuditEvent.source.observer |
| HFS_AUDIT_EXCLUDE_PATHS | (none) | Comma-separated paths to exclude from audit middleware |
| HFS_AUDIT_CLOUDWATCH_LOG_GROUP | (none) | Required when HFS_AUDIT_BACKEND=cloudwatch; CloudWatch Logs log group name |
| HFS_AUDIT_CLOUDWATCH_LOG_STREAM | hfs-audit | CloudWatch Logs log stream name |
| HFS_AUDIT_CLOUDWATCH_REGION | (AWS chain) | AWS region override for CloudWatch Logs |
| HFS_SUBSCRIPTIONS_ENABLED | false | Enable the subscription engine when HFS is built with the subscriptions feature |
| HFS_SUBSCRIPTION_MESSAGING_ENABLED | false | Enable the FHIR Messaging subscription channel |
| HFS_SUBSCRIPTION_MESSAGE_SOURCE_ENDPOINT | HFS_BASE_URL | Source endpoint URL used in outbound FHIR message headers |
| HFS_SUBSCRIPTION_ALLOW_PRIVATE_ENDPOINTS | false | Allow subscription deliveries to private or loopback endpoints; intended for local development and CI only |
| HFS_BULK_EXPORT_ENABLED | true | Enable the Bulk Data Export $export operation; when false, all $export endpoints return 501 |
| HFS_BULK_EXPORT_OUTPUT_BACKEND | local-fs | Bulk export output store: local-fs or s3. See the rest crate README for the full HFS_BULK_EXPORT_* table |
| HFS_BULK_SUBMIT_ENABLED | true | Enable the Bulk Data Submit $bulk-submit operation (HFS as Data Consumer: fetches a provider manifest and ingests it); when false, all $bulk-submit endpoints return 501. Available on sqlite/postgres (+ -elasticsearch composites). See the "Bulk Data Submit" section of CLAUDE.md for the full HFS_BULK_SUBMIT_* table |
| HFS_BULK_SUBMIT_OUTPUT_BACKEND | local-fs | Bulk submit status-artifact store: local-fs or s3 |

The SMTP delivery channel (HFS_SUBSCRIPTION_SMTP_*) and delivery-retry tuning (HFS_SUBSCRIPTION_HANDSHAKE_*) are documented in the helios-subscriptions README.

For detailed backend setup instructions (building from source, Docker commands, and search offloading architecture), see the persistence crate documentation.

Architecture Overview

The Helios FHIR Server is organized as a Rust workspace with modular components that can be used independently or together. Each component is designed for high performance and can be embedded directly into your data analytics pipeline.

Core Components

1. helios-hfs - Main Server Application

  • Executable:
    • hfs - The main Helios FHIR Server application.

2. helios-hts - FHIR Terminology Server

  • Executable:
    • hts - The Helios Terminology Server (default port 8090)

3. helios-fhir - FHIR Data Models

Generated from FHIR StructureDefinitions, type-safe Rust representations of all FHIR resources and data types.

  • Supports FHIR R4, R4B, R5, and R6 via feature flags
  • JSON serialization/deserialization with full FHIR compliance
  • Precision decimal handling for clinical accuracy
  • Default: R4 (use --all-features for all versions)

4. helios-fhir-gen - Code Generator

Generates the FHIR data models from official HL7 specifications.

  • Transforms FHIR StructureDefinitions into Rust types
  • Automatically downloads the latest R6 specs from the HL7 build server
  • See Code Generation section and helios-fhir-gen README for usage details

5. helios-fhirpath - FHIRPath Expression Engine

Complete implementation of the FHIRPath 3.0.0-ballot specification.

  • Executables:
    • fhirpath-cli - Evaluate FHIRPath expressions from the command line
    • fhirpath-server - HTTP server for FHIRPath evaluation
  • Parser built with chumsky for excellent error messages
  • Comprehensive function library with version-aware type checking
  • Auto-detects FHIR version from input data

6. helios-sof - SQL-on-FHIR Implementation

Transform FHIR resources into tabular data using ViewDefinitions.

  • Executables:
    • sof-cli - Command-line tool for batch transformations
    • sof-server - HTTP server with ViewDefinition/$viewdefinition-run operation
  • Supports multiple input formats: JSON, NDJSON, and FHIR Bundles from local/cloud storage
  • Supports multiple output formats: CSV, JSON, NDJSON, and Parquet

7. pysof - Python Bindings

Python bindings for SQL-on-FHIR using PyO3, bringing high-performance FHIR data transformation to Python.

Key Capabilities:

  • ViewDefinition Processing: Transform FHIR resources into tabular formats using ViewDefinitions
  • Multiple Output Formats: Export to CSV, JSON, NDJSON, and Parquet formats
  • Streaming Support: Efficiently process large FHIR bundles without loading everything into memory
  • Auto Version Detection: Automatically detects and handles R4, R4B, R5, and R6 FHIR versions
  • Type-Safe Interface: Leverages Rust's type safety while providing a Pythonic API
  • High Performance: Native Rust performance with minimal Python overhead

Python API Example:

import pysof

# Transform FHIR bundle to CSV using a ViewDefinition
result = pysof.run_view_definition(
    view_definition=view_def_json,
    bundle=fhir_bundle_json,
    format="csv"
)

# Process individual resources
result = pysof.run_view_definition(
    view_definition=view_def_json,
    resources=[patient1, patient2],
    format="parquet"
)

Distribution:

  • Cross-platform wheel distribution for Linux, Windows, and macOS available on PyPI

8. helios-cds-hooks - CDS Hooks Protocol Types

Rust types and traits for building CDS Hooks clinical decision support services.

  • Complete protocol types for discovery, requests, responses, cards, suggestions, and feedback
  • Strongly-typed context structs for all 10 hooks in the CDS Hooks Library
  • Async CdsHooksService trait compatible with any Rust web framework

9. helios-fhir-macro - Procedural Macros

Helper macros for code generation used by other components.

10. helios-fhirpath-support - Shared Utilities

Common types and traits for FHIRPath evaluation.

11. helios-persistence - Polyglot Persistence Layer

Storage backend abstraction supporting multiple database technologies optimized for different FHIR workloads. Also hosts the Bulk Data Export job-state stores, worker runtime, and output stores backing the server's $export operation (REST endpoints and HFS_BULK_EXPORT_* configuration live in helios-rest).

12. helios-audit - BALP Audit Logging

IHE BALP-compliant AuditEvent logging for REST, auth, persistence, and lifecycle events. Supports file, database, and AWS CloudWatch Logs sinks. See the helios-audit README for full configuration details.

Design Principles

  • Version Agnostic: All components use enum wrappers to seamlessly handle multiple FHIR versions
  • Modular Architecture: Each component can be used standalone or integrated
  • Type Safety: Leverages Rust's type system for compile-time correctness
  • Performance: Built for high-throughput clinical analytics workloads

Features

FHIR Version Support

  • FHIR R4 (4.0.1) - Default
  • FHIR R4B (4.3.0)
  • FHIR R5 (5.0.0)
  • FHIR R6 (6.0.0-ballot2)

FHIRPath Expression Language

  • Complete implementation of FHIRPath 3.0.0-ballot specification
  • 100+ built-in functions across all categories
  • HTTP server for integration with FHIRPath Lab
  • Command-line tool for testing and development

SQL-on-FHIR

  • ViewDefinition-based transformation to tabular formats
  • Multiple input formats: JSON, NDJSON (newline-delimited), and FHIR Bundles
  • Multiple output formats: CSV, JSON, NDJSON, Parquet
  • Streaming support for large datasets
  • HTTP API with $viewdefinition-run operation
  • Cloud storage support: S3, GCS, Azure Blob Storage

FHIR REST API

  • Full CRUD operations
  • Search with chained parameters
  • History and versioning
  • Batch/transaction support
  • Asynchronous Bulk Data Export ($export) at system, Patient, and Group level
  • Optional BALP-compliant AuditEvent logging for REST and auth interactions

Development

Building from Source

# Build with default features (R4 only)
# Note: pysof (Python bindings) is excluded by workspace default-members
cargo build

# Build with all FHIR versions
cargo build --all-features

# Build specific component
cargo build -p helios-fhirpath

# Build Python bindings (requires Python 3.11)
# Option A: Rust-only build of the crate
cargo build -p pysof

# Option B (recommended): build via maturin into a virtual env
cd crates/pysof
uv venv --python 3.11
uv sync
uv run maturin develop --release
# Build distributable artifacts for pysof
uv run maturin build --release -o dist   # wheels
uv run maturin sdist -o dist             # source distribution
# Build everything except Python bindings (alternative)
cargo build --workspace --exclude pysof

Running Tests

Docker Required: PostgreSQL and Elasticsearch integration tests use testcontainers to spin up real database instances in Docker. Make sure Docker is installed and running to execute the full test suite. To skip these tests (e.g., if Docker is unavailable), add -- --skip postgres_integration --skip es_integration to your test command.

# Run all tests (R4 only by default)
cargo test

# Run tests for all FHIR versions
cargo test --all-features

# Run tests without Docker (skips PostgreSQL and Elasticsearch integration tests)
cargo test --all-features -- --skip postgres_integration --skip es_integration

# Run specific test
cargo test test_name_pattern

# Run with output
cargo test -- --nocapture

Code Generation

To regenerate FHIR models from HL7 specifications:

# This will download the latest R6 (build) specifications from https://build.fhir.org/
# Note that neither --all-features nor skip-r6-download is used here.
cargo build -p helios-fhir-gen --features R4,R4B,R5,R6
# This will generate all FHIR code models (r4.rs, r4b.rs, r5.rs, and r6.rs)
./target/debug/helios-fhir-gen --all
# Format the generated files accordingly
cargo fmt --all

Code Documentation

Published crate documentation is available on crates.io. To generate and view documentation locally:

cargo doc --no-deps --open

Roadmap

See our Roadmap for current development priorities and planned features.

Contributing

Please see our Contributing Guidelines for details.

License

The Helios FHIR Server is licensed under the MIT License.

Community

We welcome contributors and feedback at every level — from opening issues to joining design discussions.


HL7® and FHIR® are registered trademarks of Health Level Seven International.

@claude

claude Bot commented Jun 23, 2026

Code review

1 issue found.


Bug: fan-out cap drops whole resource types based on alphabetical sort order

File: crates/persistence/src/sof/reference_resolver.rs, in StorageBackedResolver::resolve

// Cap fan-out before doing any I/O.
let capped = &refs[..refs.len().min(self.max_fanout)];
if capped.len() < refs.len() {
    warn!(
        requested = refs.len(),
        cap = self.max_fanout,
        "storage resolve(): reference fan-out exceeded cap; extra references left unresolved"
    );
}

The cap is applied as a plain prefix slice of refs before grouping by type. The refs input arrives from collect_missing_references, which iterates a BTreeSet<String> — lexicographic order by the full "ResourceType/id" string. This means all references sort by resource type first ("Condition/…" < "Observation/…" < "Practitioner/…"), and when the cap is hit, every reference of a later-alphabetical type is silently dropped in its entirety rather than being thinned proportionally.

Concrete example: with max_fanout = 1000, 1,000 distinct Observation references + 1 Practitioner reference → the prefix slice keeps all 1,000 Observations, drops the Practitioner completely. The dropped reference falls back to stub/empty semantics (safe, no error), but the Practitioner referenced by the view is never resolved — producing silently incorrect output.

Suggested fix: cap per-type after grouping, not on the flat sorted list. For example, distribute the budget evenly across types so no single type crowds out others due to alphabetical sort order.
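
One way to read the suggested fix is a round-robin cap across resource types. The sketch below is only an illustration of that idea, not the resolver's actual code: `cap_fairly` is a hypothetical helper, and it assumes references are plain `Type/id` strings.

```rust
use std::collections::BTreeMap;

/// Caps `refs` at `max_fanout` by taking one reference per resource type
/// in turn, so a large type cannot crowd out a small one.
fn cap_fairly(refs: &[String], max_fanout: usize) -> Vec<String> {
    // Group references by resource type, keeping input order per group.
    let mut by_type: BTreeMap<&str, Vec<&String>> = BTreeMap::new();
    for r in refs {
        let resource_type = r.split('/').next().unwrap_or("");
        by_type.entry(resource_type).or_default().push(r);
    }

    let mut kept = Vec::new();
    let mut round = 0;
    while kept.len() < max_fanout {
        let mut took_any = false;
        // In each round, take the next unused reference of every type.
        for group in by_type.values() {
            if kept.len() == max_fanout {
                break;
            }
            if let Some(r) = group.get(round) {
                kept.push((*r).clone());
                took_any = true;
            }
        }
        if !took_any {
            break; // every group is exhausted
        }
        round += 1;
    }
    kept
}
```

With the example from the review (1,000 Observation references, 1 Practitioner reference, cap of 1,000), this keeps the Practitioner and drops a single Observation instead. The output is interleaved by type rather than sorted, so a caller that groups by type afterwards is unaffected.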

@codecov

codecov Bot commented Jun 23, 2026

Codecov Report

❌ Patch coverage is 97.40260% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/persistence/src/sof/emit.rs | 87.50% | 3 Missing ⚠️ |
| crates/persistence/src/sof/reference_resolver.rs | 97.24% | 3 Missing ⚠️ |

@claude

claude Bot commented Jun 23, 2026

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Architecture Overview

Workspace Structure

The project is a Rust workspace with 17 crates (16 default-members; pysof excluded from the default build):

| Crate | Description |
| --- | --- |
| helios-fhir | Core FHIR data models (auto-generated). Supports R4, R4B, R5, R6 via feature flags. |
| helios-fhir-gen | Code generator: produces Rust structs from FHIR JSON schemas. R6 specs auto-downloaded. |
| helios-fhir-macro | Procedural macros for FHIR functionality. |
| helios-fhirpath | FHIRPath expression language: parser (chumsky), evaluator, CLI tool, and HTTP server. |
| helios-fhirpath-support | Shared support utilities for FHIRPath. |
| helios-serde | JSON and XML serialization for FHIR resources (xml feature flag). |
| helios-serde-support | Shared serde helpers. |
| helios-rest | FHIR RESTful API layer (Axum): handlers, middleware, extractors, multi-tenancy routing. |
| helios-persistence | Polyglot persistence: backends (SQLite, PostgreSQL, Elasticsearch, MongoDB), composite storage, search registry, tenant isolation. |
| helios-hfs | Main FHIR server binary. Combines helios-rest with storage backends. |
| helios-sof | SQL-on-FHIR implementation: ViewDefinition processing, CLI and HTTP server. |
| helios-hts | FHIR Terminology Server (HTS): CodeSystem/ValueSet/ConceptMap operations and terminology import (SNOMED, LOINC, RxNorm, ICD-10-CM). Provides the hts binary. |
| helios-auth | Authentication & authorization: SMART-on-FHIR / OAuth2 JWT bearer validation, JWKS, scopes, JTI replay cache. Configured via HFS_AUTH_*. |
| helios-audit | Audit logging: FHIR AuditEvent with IHE BALP profiles; pluggable sinks (database, file, CloudWatch, S3). Configured via HFS_AUDIT_*. |
| helios-subscriptions | FHIR topic-based Subscriptions engine: rest-hook, websocket, email, and messaging channels. Configured via HFS_SUBSCRIPTION(S)_*. |
| helios-cds-hooks | CDS Hooks protocol types and async service trait (HL7 CDS Hooks v3.0.0-ballot). Standalone library. |
| pysof | Python bindings (PyO3/maturin) for SQL-on-FHIR. Excluded from default workspace build. |

Binaries

| Binary | Crate | Description |
| --- | --- | --- |
| hfs | helios-hfs | FHIR server |
| fhirpath-cli | helios-fhirpath | FHIRPath expression evaluator CLI |
| fhirpath-server | helios-fhirpath | FHIRPath HTTP evaluation server |
| sof-cli | helios-sof | SQL-on-FHIR CLI tool |
| sof-server | helios-sof | SQL-on-FHIR HTTP server |
| config-advisor | helios-persistence | Storage configuration advisor |
| hts | helios-hts | FHIR Terminology Server (HTS) |

Key Design Patterns

Version-Agnostic Abstraction

The codebase uses enum wrappers and traits to handle multiple FHIR versions:

// Example from sof crate
pub enum SofViewDefinition {
    R4(fhir::r4::ViewDefinition),
    R4B(fhir::r4b::ViewDefinition),
    R5(fhir::r5::ViewDefinition),
    R6(fhir::r6::ViewDefinition),
}
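As a rough illustration of how such a wrapper is typically consumed, the sketch below matches on the enum once so callers never branch on the FHIR version themselves. The `r4` and `r5` modules, the `name` field, and the `version()` and `name()` helpers are hypothetical stand-ins for this example, not the helios-fhir or helios-sof API; only two of the four versions are shown to keep it short.

```rust
// Hypothetical stand-ins for the generated per-version types.
pub mod r4 {
    pub struct ViewDefinition {
        pub name: String,
    }
}

pub mod r5 {
    pub struct ViewDefinition {
        pub name: String,
    }
}

/// Version wrapper in the style shown above (reduced to two versions).
pub enum SofViewDefinition {
    R4(r4::ViewDefinition),
    R5(r5::ViewDefinition),
}

impl SofViewDefinition {
    /// Label of the wrapped FHIR version (illustrative helper).
    pub fn version(&self) -> &'static str {
        match self {
            Self::R4(_) => "R4",
            Self::R5(_) => "R5",
        }
    }

    /// Field access that is identical across versions is written once here.
    pub fn name(&self) -> &str {
        match self {
            Self::R4(v) => &v.name,
            Self::R5(v) => &v.name,
        }
    }
}
```

Adding a new version then means adding one variant and letting the compiler point at every `match` that needs a new arm.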

Trait-Based Processing

Core functionality is defined through traits, allowing version-independent logic:

  • ViewDefinitionTrait, BundleTrait, ResourceTrait (SOF)
  • ResourceStorage, VersionedStorage, SearchProvider, Transaction (persistence)

Persistence Trait Hierarchy

Storage backends implement a progressive trait hierarchy:

ResourceStorage → VersionedStorage → InstanceHistoryProvider → TypeHistoryProvider → SystemHistoryProvider
ResourceStorage → SearchProvider → MultiTypeSearchProvider / ChainedSearchProvider / IncludeProvider
ResourceStorage → TransactionProvider → BundleProvider
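A progressive hierarchy like this maps naturally onto Rust supertraits. The sketch below is a simplified, synchronous illustration only: the method names, the `MemoryBackend` type, and the absence of a tenant argument are assumptions made for brevity, and the real traits in helios-persistence may look quite different.

```rust
use std::collections::HashMap;

/// Base capability (hypothetical signature): read the latest version.
pub trait ResourceStorage {
    fn read(&self, id: &str) -> Option<String>;
}

/// Extended capability: anything versioned must also be a ResourceStorage.
pub trait VersionedStorage: ResourceStorage {
    /// `version` is 1-based in this sketch.
    fn vread(&self, id: &str, version: usize) -> Option<String>;
}

/// Toy in-memory backend keeping every version of every resource.
#[derive(Default)]
pub struct MemoryBackend {
    versions: HashMap<String, Vec<String>>,
}

impl MemoryBackend {
    pub fn put(&mut self, id: &str, body: &str) {
        self.versions
            .entry(id.to_string())
            .or_default()
            .push(body.to_string());
    }
}

impl ResourceStorage for MemoryBackend {
    fn read(&self, id: &str) -> Option<String> {
        self.versions.get(id)?.last().cloned()
    }
}

impl VersionedStorage for MemoryBackend {
    fn vread(&self, id: &str, version: usize) -> Option<String> {
        // checked_sub turns the invalid version 0 into None instead of panicking.
        let index = version.checked_sub(1)?;
        self.versions.get(id)?.get(index).cloned()
    }
}

/// Callers ask for the weakest bound they need; this one needs history.
pub fn latest_and_first<S: VersionedStorage>(
    storage: &S,
    id: &str,
) -> (Option<String>, Option<String>) {
    (storage.read(id), storage.vread(id, 1))
}
```

A backend that only implements the base trait still works everywhere the base bound is enough, which is the point of layering the traits.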

Tenant-First Design

All persistence operations take a TenantContext as the first argument, ensuring data isolation. Every storage backend enforces tenant boundaries at the query level.
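To make the idea concrete, here is a minimal sketch of a store whose every operation takes the tenant first and folds it into the lookup key. The `tenant_id` field, the `TenantScopedStore` type, and its methods are hypothetical and exist only for this example; the real `TenantContext` and backends are not shown in this file.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced tenant context for illustration.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct TenantContext {
    pub tenant_id: String,
}

/// Toy store keyed by (tenant id, "Type/id") so lookups cannot cross tenants.
#[derive(Default)]
pub struct TenantScopedStore {
    rows: HashMap<(String, String), String>,
}

impl TenantScopedStore {
    /// The tenant comes first, mirroring the convention described above.
    pub fn create(&mut self, tenant: &TenantContext, key: &str, body: &str) {
        self.rows.insert(
            (tenant.tenant_id.clone(), key.to_string()),
            body.to_string(),
        );
    }

    /// The tenant is part of the key, so another tenant's row is never matched.
    pub fn read(&self, tenant: &TenantContext, key: &str) -> Option<&str> {
        self.rows
            .get(&(tenant.tenant_id.clone(), key.to_string()))
            .map(String::as_str)
    }
}
```

Because there is no method that omits the tenant, a caller cannot accidentally issue an unscoped read.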

Composite Storage

The CompositeStorage pattern combines backends (e.g., SQLite for CRUD + Elasticsearch for search) behind a single interface. Configured via HFS_STORAGE_BACKEND.
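One way to picture the pattern is a wrapper that routes reads to a primary store and queries to a search store. Everything in the sketch below is an assumption for illustration: the `Crud` and `Search` traits, the toy backends, and in particular the choice to write to both backends on `put` are not taken from the actual CompositeStorage implementation.

```rust
use std::collections::HashMap;

/// Hypothetical CRUD-side capability.
pub trait Crud {
    fn put(&mut self, id: &str, body: &str);
    fn get(&self, id: &str) -> Option<String>;
}

/// Hypothetical search-side capability.
pub trait Search {
    fn index(&mut self, id: &str, body: &str);
    fn find(&self, term: &str) -> Vec<String>;
}

/// Toy primary store.
#[derive(Default)]
pub struct MapCrud {
    rows: HashMap<String, String>,
}

impl Crud for MapCrud {
    fn put(&mut self, id: &str, body: &str) {
        self.rows.insert(id.to_string(), body.to_string());
    }
    fn get(&self, id: &str) -> Option<String> {
        self.rows.get(id).cloned()
    }
}

/// Toy search store: a linear scan over indexed documents.
#[derive(Default)]
pub struct ScanSearch {
    docs: Vec<(String, String)>,
}

impl Search for ScanSearch {
    fn index(&mut self, id: &str, body: &str) {
        self.docs.push((id.to_string(), body.to_string()));
    }
    fn find(&self, term: &str) -> Vec<String> {
        self.docs
            .iter()
            .filter(|(_, body)| body.contains(term))
            .map(|(id, _)| id.clone())
            .collect()
    }
}

/// Single interface over both backends.
pub struct Composite<C, S> {
    crud: C,
    search: S,
}

impl<C: Crud, S: Search> Composite<C, S> {
    pub fn new(crud: C, search: S) -> Self {
        Self { crud, search }
    }
    /// Writes go to the primary store, then to the index (an assumption here).
    pub fn put(&mut self, id: &str, body: &str) {
        self.crud.put(id, body);
        self.search.index(id, body);
    }
    pub fn get(&self, id: &str) -> Option<String> {
        self.crud.get(id)
    }
    pub fn find(&self, term: &str) -> Vec<String> {
        self.search.find(term)
    }
}
```

Callers hold one `Composite` value and never need to know which backend answered.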

Project Skills

Detailed operational guidance lives in project skills under .claude/skills/.
Use those skills instead of expanding this always-loaded file:

  • /run-hfs-server - HFS server runtime, storage backends, multi-tenancy, compression, and API endpoints.
  • /work-with-fhirpath - FHIRPath CLI, server, expressions, terminology integration, and tests.
  • /work-with-sof - SQL-on-FHIR, ViewDefinition processing, sof-cli, sof-server, and parquet output.
  • /work-with-pysof - Python bindings under crates/pysof, maturin setup, API usage, and pysof tests.
  • /test-hfs - Test strategy, testcontainers, persistence integration tests, and shared test data.
  • /work-with-hts - Terminology server configuration, APIs, bootstrap sync, and terminology imports.
  • /work-with-auth - Authentication/authorization, SMART-on-FHIR, JWT/JWKS, scopes, JTI cache, and HFS_AUTH_* config.
  • /work-with-audit - FHIR AuditEvent logging, IHE BALP, audit sinks, and HFS_AUDIT_* config.
  • /work-with-subscriptions - Topic-based Subscriptions engine, channels (rest-hook/websocket/email/messaging), and config.
  • /work-with-cds-hooks - CDS Hooks protocol types and async service trait for clinical decision support.
  • /bulk-data-export - FHIR Bulk Data Access $export jobs, manifests, output storage, and behavior notes.
  • /bulk-data-submit - FHIR Bulk Data Submit $bulk-submit ingestion, status, OAuth, JWE, and worker settings.
  • /docker-and-release - Docker image builds and release workflow.

Environment Setup

LLD Linker Configuration

Add to ~/.cargo/config.toml:

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=lld"]

Memory-Constrained Builds

export CARGO_BUILD_JOBS=4

Debugging Tips

  • Use cargo test -- --nocapture to see println! output
  • Enable trace logging: RUST_LOG=trace cargo run
  • FHIRPath expressions can be tested independently via CLI
  • HFS server: HFS_LOG_LEVEL=debug cargo run --bin hfs

Important Notes

  • The default FHIR version is R4 when no version features are specified
  • FHIR version feature assumption: Code MAY assume that at least one FHIR version feature is enabled at compile time, and SHOULD assume R4 is enabled when relying on FhirVersion::default() (which is gated on feature = "R4"). Avoid adding cfg-ladder fallbacks for the "no version enabled" case — that build target is not supported. Single-version minimal builds (e.g. R4B-only) are supported, but functions that need a default value should require R4 explicitly rather than enumerating versions in #[cfg] arms.
  • The project follows standard Rust conventions
  • pysof is excluded from default workspace members — cargo build from root skips it
  • Server returns appropriate HTTP status codes and FHIR OperationOutcomes for errors
  • Minimum supported Rust version: 1.90 (edition 2024)

@claude

claude Bot commented Jun 23, 2026


AGENTS.md

This file provides guidance to Codex when working with code in this repository.

Architecture Overview

Workspace Structure

The project is a Rust workspace with 17 crates (16 default-members; pysof excluded from the default build):

| Crate | Description |
| --- | --- |
| helios-fhir | Core FHIR data models (auto-generated). Supports R4, R4B, R5, R6 via feature flags. |
| helios-fhir-gen | Code generator - produces Rust structs from FHIR JSON schemas. R6 specs auto-downloaded. |
| helios-fhir-macro | Procedural macros for FHIR functionality. |
| helios-fhirpath | FHIRPath expression language - parser (chumsky), evaluator, CLI tool, and HTTP server. |
| helios-fhirpath-support | Shared support utilities for FHIRPath. |
| helios-serde | JSON and XML serialization for FHIR resources (xml feature flag). |
| helios-serde-support | Shared serde helpers. |
| helios-rest | FHIR RESTful API layer (Axum) - handlers, middleware, extractors, multi-tenancy routing. |
| helios-persistence | Polyglot persistence - backends (SQLite, PostgreSQL, Elasticsearch, MongoDB), composite storage, search registry, tenant isolation. |
| helios-hfs | Main FHIR server binary. Combines helios-rest with storage backends. |
| helios-sof | SQL-on-FHIR implementation - ViewDefinition processing, CLI and HTTP server. |
| helios-hts | FHIR Terminology Server (HTS) - CodeSystem/ValueSet/ConceptMap operations and terminology import (SNOMED, LOINC, RxNorm, ICD-10-CM). Provides the hts binary. |
| helios-auth | Authentication & authorization - SMART-on-FHIR / OAuth2 JWT bearer validation, JWKS, scopes, JTI replay cache. Configured via HFS_AUTH_*. |
| helios-audit | Audit logging - FHIR AuditEvent with IHE BALP profiles; pluggable sinks (database, file, CloudWatch, S3). Configured via HFS_AUDIT_*. |
| helios-subscriptions | FHIR topic-based Subscriptions engine - rest-hook, websocket, email, and messaging channels. Configured via HFS_SUBSCRIPTION(S)_*. |
| helios-cds-hooks | CDS Hooks protocol types and async service trait (HL7 CDS Hooks v3.0.0-ballot). Standalone library. |
| pysof | Python bindings (PyO3/maturin) for SQL-on-FHIR. Excluded from default workspace build. |

Binaries

| Binary | Crate | Description |
| --- | --- | --- |
| hfs | helios-hfs | FHIR server |
| fhirpath-cli | helios-fhirpath | FHIRPath expression evaluator CLI |
| fhirpath-server | helios-fhirpath | FHIRPath HTTP evaluation server |
| sof-cli | helios-sof | SQL-on-FHIR CLI tool |
| sof-server | helios-sof | SQL-on-FHIR HTTP server |
| config-advisor | helios-persistence | Storage configuration advisor |
| hts | helios-hts | FHIR Terminology Server (HTS) |

Key Design Patterns

Version-Agnostic Abstraction

The codebase uses enum wrappers and traits to handle multiple FHIR versions:

// Example from sof crate
pub enum SofViewDefinition {
    R4(fhir::r4::ViewDefinition),
    R4B(fhir::r4b::ViewDefinition),
    R5(fhir::r5::ViewDefinition),
    R6(fhir::r6::ViewDefinition),
}

Trait-Based Processing

Core functionality is defined through traits, allowing version-independent logic:

  • ViewDefinitionTrait, BundleTrait, ResourceTrait (SOF)
  • ResourceStorage, VersionedStorage, SearchProvider, Transaction (persistence)

Persistence Trait Hierarchy

Storage backends implement a progressive trait hierarchy:

ResourceStorage -> VersionedStorage -> InstanceHistoryProvider -> TypeHistoryProvider -> SystemHistoryProvider
ResourceStorage -> SearchProvider -> MultiTypeSearchProvider / ChainedSearchProvider / IncludeProvider
ResourceStorage -> TransactionProvider -> BundleProvider

Tenant-First Design

All persistence operations take a TenantContext as the first argument, ensuring data isolation. Every storage backend enforces tenant boundaries at the query level.

Composite Storage

The CompositeStorage pattern combines backends (e.g., SQLite for CRUD + Elasticsearch for search) behind a single interface. Configured via HFS_STORAGE_BACKEND.

Project Skills

Detailed operational guidance lives in Codex project skills under .agents/skills/.
Use those skills instead of expanding this always-loaded file:

  • $run-hfs-server - HFS server runtime, storage backends, multi-tenancy, compression, and API endpoints.
  • $work-with-fhirpath - FHIRPath CLI, server, expressions, terminology integration, and tests.
  • $work-with-sof - SQL-on-FHIR, ViewDefinition processing, sof-cli, sof-server, and parquet output.
  • $work-with-pysof - Python bindings under crates/pysof, maturin setup, API usage, and pysof tests.
  • $test-hfs - Test strategy, testcontainers, persistence integration tests, and shared test data.
  • $work-with-hts - Terminology server configuration, APIs, bootstrap sync, and terminology imports.
  • $work-with-auth - Authentication/authorization, SMART-on-FHIR, JWT/JWKS, scopes, JTI cache, and HFS_AUTH_* config.
  • $work-with-audit - FHIR AuditEvent logging, IHE BALP, audit sinks, and HFS_AUDIT_* config.
  • $work-with-subscriptions - Topic-based Subscriptions engine, channels (rest-hook/websocket/email/messaging), and config.
  • $work-with-cds-hooks - CDS Hooks protocol types and async service trait for clinical decision support.
  • $bulk-data-export - FHIR Bulk Data Access $export jobs, manifests, output storage, and behavior notes.
  • $bulk-data-submit - FHIR Bulk Data Submit $bulk-submit ingestion, status, OAuth, JWE, and worker settings.
  • $docker-and-release - Docker image builds and release workflow.

Environment Setup

LLD Linker Configuration

Add to ~/.cargo/config.toml:

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=lld"]

Memory-Constrained Builds

export CARGO_BUILD_JOBS=4

Debugging Tips

  • Use cargo test -- --nocapture to see println! output
  • Enable trace logging: RUST_LOG=trace cargo run
  • FHIRPath expressions can be tested independently via CLI
  • HFS server: HFS_LOG_LEVEL=debug cargo run --bin hfs

Important Notes

  • The default FHIR version is R4 when no version features are specified
  • FHIR version feature assumption: Code MAY assume that at least one FHIR version feature is enabled at compile time, and SHOULD assume R4 is enabled when relying on FhirVersion::default() (which is gated on feature = "R4"). Avoid adding cfg-ladder fallbacks for the "no version enabled" case - that build target is not supported. Single-version minimal builds (e.g. R4B-only) are supported, but functions that need a default value should require R4 explicitly rather than enumerating versions in #[cfg] arms.
  • The project follows standard Rust conventions
  • pysof is excluded from default workspace members - cargo build from root skips it
  • Server returns appropriate HTTP status codes and FHIR OperationOutcomes for errors
  • Minimum supported Rust version: 1.90 (edition 2024)

…2026-0185

Two ambient CI breakages on origin/main, unrelated to this PR's changes:

- clippy 1.91 newly flags collapsible_else_if in sof/emit.rs; collapse the
  nested else { if .. } into else if.
- cargo audit fails on RUSTSEC-2026-0185 (quinn-proto remote memory
  exhaustion), a transitive reqwest QUIC dep. We never accept inbound QUIC,
  so the reassembly path is unreachable; ignore it with justification.
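
For readers unfamiliar with the lint, the shape of the clippy fix can be sketched as follows. The functions are made up for illustration and are not the code in sof/emit.rs; the `#[allow]` on the first one is there only so the "before" form can sit next to the "after" form without tripping the lint.

```rust
// Before: an `if` nested alone inside an `else` block, which
// clippy::collapsible_else_if reports.
#[allow(clippy::collapsible_else_if)]
pub fn classify_nested(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else {
        if n == 0 {
            "zero"
        } else {
            "positive"
        }
    }
}

// After: the nested block collapsed into `else if`; behavior is unchanged.
pub fn classify_flat(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}
```

The rewrite is purely structural, so it can land separately from any behavioral change.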

Development

Successfully merging this pull request may close these issues.

FHIRPath resolve(): add storage-backed resolution for server-stored resources
