PERF: WritePoleFigure requires O(N) contiguous Euler buffer per phase #1593

@joeykleingers

Summary

WritePoleFigureFilter allocates a single contiguous ebsdlib::FloatArrayType buffer sized to all cells of the current phase before handing it to ebsdlib's pole-figure computation. On very large datasets this buffer dominates the filter's memory footprint because ebsdlib's pole-figure kernels require the full buffer up front — there is no streaming, chunked, or callback interface.

Current behavior

In src/Plugins/OrientationAnalysis/src/OrientationAnalysis/Filters/Algorithms/WritePoleFigure.cpp on develop (as of 9b667ae0b), the per-phase loop around lines 565–609 does:

  1. Walk every cell and count cells matching the current phase (+ optional mask) → count.
  2. Allocate a flat buffer:
    const ebsdlib::FloatArrayType::Pointer subEulerAnglesPtr =
        ebsdlib::FloatArrayType::CreateArray(count, eulerCompDim, "Euler_Angles_Per_Phase", true);
    subEulerAnglesPtr->initializeWithValue(std::numeric_limits<float>::signaling_NaN());
    The buffer occupies count × 3 × sizeof(float) bytes, i.e. 12 bytes per matching cell.
  3. Walk every cell again and densify matching Eulers into that buffer.
  4. Pass the raw pointer through PoleFigureConfiguration_t::eulers to the downstream ebsdlib kernels:
    ebsdlib::PoleFigureConfiguration_t config;
    config.eulers = subEulerAnglesPtr.get();
    // ...
    figures = makePoleFigures<ebsdlib::CubicOps>(config);
    intensityImages = createIntensityPoleFigures<ebsdlib::CubicOps>(config, m_InputValues->NormalizeToMRD);

makePoleFigures and createIntensityPoleFigures (both defined in ebsdlib) walk config.eulers linearly. They don't expose a streaming, chunked, or callback interface, so the simplnx caller has to materialize the full buffer before invoking them.
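The count-then-densify pattern from steps 1 to 3 can be sketched in isolation. This is a minimal stand-in, not the simplnx code: `densifyPhaseEulers`, its parameters, and the use of `std::vector` in place of `ebsdlib::FloatArrayType` are all assumptions made for illustration. It shows why the allocation scales with the number of matching cells.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the per-phase densify step; not the simplnx implementation.
// `eulers` holds 3 floats per cell; `mask` is optional and may be nullptr.
std::vector<float> densifyPhaseEulers(const std::vector<std::int32_t>& phases,
                                      const std::vector<float>& eulers,
                                      const std::vector<std::uint8_t>* mask,
                                      std::int32_t targetPhase)
{
  constexpr std::size_t k_EulerCompDim = 3;
  const std::size_t numCells = phases.size();

  auto matches = [&](std::size_t i) {
    return phases[i] == targetPhase && (mask == nullptr || (*mask)[i] != 0);
  };

  // Pass 1: count the cells that belong to the phase (and pass the mask).
  std::size_t count = 0;
  for(std::size_t i = 0; i < numCells; i++)
  {
    if(matches(i))
    {
      count++;
    }
  }

  // The allocation the issue is about: count * 3 * sizeof(float) bytes, all at once.
  std::vector<float> subEulers(count * k_EulerCompDim, std::numeric_limits<float>::signaling_NaN());

  // Pass 2: copy the matching Euler triplets into the dense buffer.
  std::size_t out = 0;
  for(std::size_t i = 0; i < numCells; i++)
  {
    if(matches(i))
    {
      for(std::size_t c = 0; c < k_EulerCompDim; c++)
      {
        subEulers[out * k_EulerCompDim + c] = eulers[i * k_EulerCompDim + c];
      }
      out++;
    }
  }
  return subEulers;
}
```

Whatever consumes the returned buffer sees one contiguous block, which is what a raw `config.eulers` pointer requires.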

Impact

For a phase containing 500M cells, step 2 allocates ~6 GB (500M × 3 × 4 bytes). The filter's peak heap usage is therefore at least max_over_phases(cells_in_phase) × 12 bytes, regardless of whether the filter reads its inputs lazily or streams its output images. On datasets that approach or exceed available RAM, this allocation sets the hard ceiling on what the filter can process.
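The arithmetic behind that figure can be written down as a small helper. The function name `peakEulerBufferBytes` is hypothetical, and the sketch assumes 3 Euler components stored as 4-byte floats, as in the snippet above.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// 3 Euler components per cell, each a float (4 bytes on the platforms in question).
constexpr std::uint64_t k_BytesPerCell = 3 * sizeof(float);

// Hypothetical helper: size in bytes of the largest per-phase Euler buffer,
// given the number of matching cells in each phase.
std::uint64_t peakEulerBufferBytes(const std::vector<std::uint64_t>& cellsPerPhase)
{
  std::uint64_t maxCells = 0;
  for(const std::uint64_t cells : cellsPerPhase)
  {
    maxCells = std::max(maxCells, cells);
  }
  return maxCells * k_BytesPerCell;
}
```

Only the largest phase matters because the buffer is reallocated per phase rather than held for all phases at once.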

Required changes

Resolving this requires a new API in BlueQuartzSoftware/ebsdlib first. Three candidate directions:

  1. Streaming pole-figure kernel — add an overload of makePoleFigures / createIntensityPoleFigures that accepts a callback or iterator yielding (count, const float* chunk) pairs and incrementally updates the pole-figure bins.
  2. Memory-mapped backing for PoleFigureConfiguration_t::eulers — allow the buffer to be backed by a memory-mapped file so the ebsdlib kernel's linear walk is served by file cache instead of anonymous RAM.
  3. Per-chunk accumulator — expose the pole-figure bin buffers as a public accumulator type so callers iterate their own chunks of Eulers and call accumulate(eulers_chunk) then finalize().

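(1) and (3) are the more idiomatic designs, since both let the caller bound the working set to a single chunk. (2) is the lowest-effort option if mmap semantics fit the access pattern, though it moves the buffer into the file cache rather than shrinking it.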
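To make direction (3) concrete, here is a rough sketch of what an accumulator type could look like. Everything in it is an assumption: `PoleFigureAccumulator`, `accumulate`, and `finalize` do not exist in ebsdlib today, and the binning is a deliberately trivial placeholder (a 1D histogram over the second Euler angle) standing in for the real pole-figure projection. The point is only that bin updates are additive, so feeding chunks gives the same result as feeding one large buffer.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical accumulator; not an existing ebsdlib type.
class PoleFigureAccumulator
{
public:
  // numBins is assumed to be greater than zero.
  explicit PoleFigureAccumulator(std::size_t numBins)
  : m_Counts(numBins, 0)
  {
  }

  // Fold one chunk of Euler triplets (3 floats per orientation) into the bins.
  void accumulate(const float* eulers, std::size_t numOrientations)
  {
    constexpr float k_Pi = 3.14159265358979f;
    for(std::size_t i = 0; i < numOrientations; i++)
    {
      const float phi = eulers[i * 3 + 1];
      if(std::isnan(phi))
      {
        continue;
      }
      // Placeholder binning: map phi in [0, pi] onto the bin range.
      const float t = std::max(0.0f, std::min(phi / k_Pi, 1.0f));
      const std::size_t bin = std::min(static_cast<std::size_t>(t * static_cast<float>(m_Counts.size())), m_Counts.size() - 1);
      m_Counts[bin]++;
      m_Total++;
    }
  }

  // Convert raw counts to fractions of the total number of orientations seen.
  std::vector<double> finalize() const
  {
    std::vector<double> result(m_Counts.size(), 0.0);
    if(m_Total == 0)
    {
      return result;
    }
    for(std::size_t b = 0; b < m_Counts.size(); b++)
    {
      result[b] = static_cast<double>(m_Counts[b]) / static_cast<double>(m_Total);
    }
    return result;
  }

private:
  std::vector<std::uint64_t> m_Counts;
  std::uint64_t m_Total = 0;
};
```

With this shape the accumulator's memory is proportional to the number of bins, not the number of cells, and the caller decides how large each chunk is.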

Once the ebsdlib side exists, WritePoleFigure.cpp would be rewritten to iterate chunks of phase + Eulers directly into the new API, eliminating the subEulerAnglesPtr allocation entirely.
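On the simplnx side, the chunked iteration might look roughly like the sketch below. It is an illustration under assumptions, not a proposed patch: `streamPhaseEulers` and `EulerChunkSink` are hypothetical names, plain `std::vector` inputs stand in for the real data arrays, and the sink stands in for whichever ebsdlib entry point ends up existing. The scratch buffer holds at most `chunkCells` triplets, so memory is bounded by `chunkCells × 12` bytes instead of the phase size.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical sink signature: receives a pointer to `count` Euler triplets.
using EulerChunkSink = std::function<void(const float* chunk, std::size_t count)>;

// Walk the cells once, gather Eulers of the target phase into a fixed-size
// scratch buffer, and hand each full (or final partial) chunk to the sink.
// If chunkCells is 0, everything is delivered in a single final chunk.
void streamPhaseEulers(const std::vector<std::int32_t>& phases,
                       const std::vector<float>& eulers,
                       std::int32_t targetPhase,
                       std::size_t chunkCells,
                       const EulerChunkSink& sink)
{
  constexpr std::size_t k_EulerCompDim = 3;
  std::vector<float> scratch;
  scratch.reserve(chunkCells * k_EulerCompDim);

  for(std::size_t i = 0; i < phases.size(); i++)
  {
    if(phases[i] != targetPhase)
    {
      continue;
    }
    const float* src = eulers.data() + i * k_EulerCompDim;
    scratch.insert(scratch.end(), src, src + k_EulerCompDim);

    if(scratch.size() == chunkCells * k_EulerCompDim)
    {
      sink(scratch.data(), chunkCells);
      scratch.clear(); // keeps capacity, so no reallocation per chunk
    }
  }

  // Flush the last partial chunk, if any.
  if(!scratch.empty())
  {
    sink(scratch.data(), scratch.size() / k_EulerCompDim);
  }
}
```

A caller would pass a lambda that forwards each chunk to the new kernel or accumulator. The single pass also removes the separate counting walk, since no up-front size is needed.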
