From b6d8c36f218b5302d98f9a4e0191c1703384eb2c Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:11:51 +0200 Subject: [PATCH 01/15] Add CategoryOwner ADR and migration plan --- .../adr_category-owner-sections.md | 298 ++++++ docs/dev/plan_category-owner-sections.md | 950 ++++++++++++++++++ 2 files changed, 1248 insertions(+) create mode 100644 docs/dev/ADR-suggestions/adr_category-owner-sections.md create mode 100644 docs/dev/plan_category-owner-sections.md diff --git a/docs/dev/ADR-suggestions/adr_category-owner-sections.md b/docs/dev/ADR-suggestions/adr_category-owner-sections.md new file mode 100644 index 00000000..8778af48 --- /dev/null +++ b/docs/dev/ADR-suggestions/adr_category-owner-sections.md @@ -0,0 +1,298 @@ +# ADR: Category Owners and Real Datablocks + +**Status:** Proposed +**Date:** 2026-05-17 + +## Context + +The current architecture has two real datablock families: + +- structures +- experiments + +These are real datablocks because each instance maps to a CIF `data_` +block and has a datablock entry name. This is reflected in +`DatablockItem`, `DatablockCollection`, and `datablock_item_to_cif()`. + +`Analysis` and project-level configuration are different. They own +CIF-like categories, but they are singleton project sections rather than +collections of independently named data blocks. They should not need a +datablock ID, and they should not serialize as `data_analysis` or +`data_project`. + +The problem is that `DatablockItem` currently combines two +responsibilities: + +1. It owns and updates flat categories. +2. It represents a real CIF data block with a `data_` header. + +This makes it tempting to make `Analysis` inherit from `DatablockItem` +only to reuse category discovery, parameter enumeration, update +orchestration, and CIF category serialization. That would improve code +reuse, but it would weaken the meaning of "datablock" in the model. + +The open issue "Make `Analysis` a `DatablockItem`" already identifies +the design fork: either make `Analysis` inherit from `DatablockItem`, or +extract a shared category-update protocol. + +A detailed migration plan exists in: + +```text +docs/dev/plan_category-owner-sections.md +``` + +This ADR records the intended architectural decision behind that plan. + +## Decision + +Introduce a new core abstraction named `CategoryOwner`. + +`CategoryOwner` represents an object that owns flat CIF-like categories. +It provides shared behavior for: + +- category discovery +- category sorting by `_update_priority` +- parameter aggregation +- category update orchestration +- dirty-flag tracking +- category help display +- category-body CIF serialization without a `data_` header + +`DatablockItem` will inherit from `CategoryOwner` and add only the +behavior specific to real CIF data blocks: + +- `unique_name` from `datablock_entry_name` +- `data_` header serialization +- participation in `DatablockCollection` + +The target hierarchy is: + +```text +GuardedBase +|-- CategoryItem +|-- CollectionBase +| |-- CategoryCollection +| `-- DatablockCollection +`-- CategoryOwner + |-- DatablockItem + | |-- Structure + | `-- ExperimentBase + |-- Analysis + `-- ProjectConfig # optional follow-up +``` + +## Detailed Rules + +### 1. A real datablock must emit a CIF `data_` header + +Only objects that serialize as independent CIF data blocks should inherit +datablock-specific behavior. + +Current real datablocks: + +- `Structure` +- `ExperimentBase` subclasses + +These should continue to emit: + +```text +data_ +data_ +``` + +### 2. `Analysis` is a category-owning singleton section + +`Analysis` should inherit from `CategoryOwner`, not `DatablockItem`. + +It owns categories such as: + +- `fitting` +- `aliases` +- `constraints` +- `joint_fit` +- `sequential_fit` +- `sequential_fit_extract` + +It should reuse shared category discovery and parameter aggregation, but +its saved CIF should remain a section body. It must not emit +`data_analysis`. + +### 3. Project configuration may become a category-owning singleton section + +Project-level configuration should eventually follow the same model. + +The likely long-term shape is: + +```text +Project +`-- ProjectConfig(CategoryOwner) + |-- ProjectInfo or ProjectMetadata + `-- Rendering +``` + +This should be treated as a follow-up after the `Analysis` migration is +stable, because project save/load compatibility is a separate risk. + +### 4. CIF serialization should be split by responsibility + +Add a serializer for category-owner bodies: + +```python +category_owner_to_cif(owner) +``` + +This function serializes categories but does not add a `data_` header. + +Keep `datablock_item_to_cif(datablock)` for real datablocks. It should +compose: + +1. `data_` +2. `category_owner_to_cif(datablock)` + +### 5. Active analysis categories remain an owner-level policy + +`Analysis` has mode-specific sibling categories. They remain direct +children of `Analysis`, not nested under `fitting`. + +`Analysis` should expose a hook such as: + +```python +def _serializable_categories(self) -> list: + ... +``` + +This hook controls which categories are written for the current fitting +mode: + +- shared categories always serialize +- `joint_fit` serializes in joint mode +- `sequential_fit` and `sequential_fit_extract` serialize in sequential + mode +- inactive mode-specific categories remain accessible but are not + serialized + +This preserves the active-sibling selector model accepted in +`docs/dev/ADRs/adr_fit-mode-categories.md`. + +## Consequences + +### Positive + +- The term "datablock" remains precise. +- `Analysis` can reuse standard category discovery and parameter + enumeration without becoming a fake data block. +- CIF serialization becomes clearer because category-body rendering is + separated from `data_` header rendering. +- Future project-level category sections can reuse the same base class. +- Dirty-flag behavior can be generalized from `DatablockItem` to + `CategoryOwner`. + +### Trade-offs + +- A new core abstraction must be introduced and documented. +- `DatablockItem` must be refactored without changing structure and + experiment behavior. +- `Analysis` migration touches both runtime behavior and CIF + serialization, so it needs characterization tests. +- Project configuration cleanup should be delayed to avoid mixing two + save/load migrations. + +### Compatibility Requirements + +The migration must preserve: + +- `structure.as_cif` starts with `data_` +- `experiment.as_cif` starts with `data_` +- `analysis.as_cif` does not start with `data_` +- project save/load layout remains compatible +- `project.parameters` remains fit-focused and does not include analysis + configuration parameters unless a later ADR changes that contract + +## Alternatives Considered + +### Make `Analysis` inherit from `DatablockItem` + +Rejected. + +This would be the smallest code change, but it would make "datablock" +mean both real CIF data blocks and singleton project sections. It would +also encourage fake identities such as `datablock_entry_name = +"analysis"`. + +### Add `emit_data_header = False` to `DatablockItem` + +Rejected. + +This keeps inheritance reuse but encodes two different concepts in one +class. It also makes every future datablock-related method check whether +the object is a real datablock. + +### Keep `Analysis` fully ad hoc + +Rejected. + +This preserves current behavior but leaves duplicated logic for category +discovery, category updates, parameter enumeration, and CIF section +serialization. + +### Make `Project` itself a `CategoryOwner` + +Deferred. + +`Project` is a facade that owns collections, display helpers, summary +helpers, analysis, and file I/O. Making the facade itself a category +owner would mix orchestration with category-section behavior. A smaller +`ProjectConfig(CategoryOwner)` is a better follow-up. + +## Implementation Plan + +Follow: + +```text +docs/dev/plan_category-owner-sections.md +``` + +The plan should be implemented in small phases: + +1. Add baseline tests. +2. Add `CategoryOwner`. +3. Make `DatablockItem` inherit `CategoryOwner`. +4. Split category-owner CIF body serialization from datablock + serialization. +5. Move `Analysis` onto `CategoryOwner`. +6. Generalize dirty-flag lookup from `DatablockItem` to `CategoryOwner`. +7. Optionally introduce `ProjectConfig`. +8. Update architecture and issue tracking. + +## Post-Implementation ADR Update + +This ADR must be updated after the migration plan is implemented. + +When implementation is complete: + +1. Change status from `Proposed` to `Accepted and implemented`. +2. Update the date if the project convention requires the implementation + date. +3. Replace tentative wording such as "should" and "target hierarchy" with + the actual final design. +4. Record any deviations from the migration plan. +5. Link to the implementation PR or commit if available. +6. Move this file from `docs/dev/ADR-suggestions/` to `docs/dev/ADRs/` if + that is the repository convention for accepted decisions. +7. Update `docs/dev/architecture.md`. +8. Update or close the related issue in `docs/dev/Issues/issues_open.md` + (move to `docs/dev/Issues/issues_closed.md` on full resolution). + +## Acceptance Criteria + +This ADR is satisfied when: + +- `CategoryOwner` exists and is documented. +- `DatablockItem` inherits from `CategoryOwner`. +- `Analysis` inherits from `CategoryOwner`, not `DatablockItem`. +- `Analysis` uses shared category discovery and parameter enumeration. +- real datablocks still emit `data_` headers. +- singleton sections do not emit fake `data_` headers. +- dirty-flag propagation works for all category owners. +- architecture documentation distinguishes real datablocks from + category-owning sections. diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md new file mode 100644 index 00000000..2c93ea9b --- /dev/null +++ b/docs/dev/plan_category-owner-sections.md @@ -0,0 +1,950 @@ +# Category Owner Migration Plan + +## Status + +Branch: `feature/category-owner-sections` + +Two-phase workflow (see `.github/copilot-instructions.md`): + +- Phase 1 — Implementation. Code, docs, and architecture updates only. + Phase 0 baseline characterization tests below are an explicit + exception (they describe pre-existing behavior so the refactor can be + verified mechanically). Do not add new feature tests during Phase 1. +- Phase 2 — Verification. Add the per-phase tests listed below, then + run the verification commands in the "Phase 2 Verification" section. + +Stop after Phase 1 and request review before starting Phase 2. + +Status checklist (mark `[x]` as completed): + +```text +Phase 1 — Implementation +[ ] Phase 0: Add baseline characterization tests. +[ ] Phase 1: Add CategoryOwner in core/category_owner.py. +[ ] Phase 2: Make DatablockItem inherit CategoryOwner. +[ ] Phase 3: Split CIF body serialization (category_owner_to_cif). +[ ] Phase 4: Move Analysis onto CategoryOwner. +[ ] Phase 5: Update dirty-flag lookup to CategoryOwner. +[ ] Phase 6: (Optional) ProjectConfig cleanup. +[ ] Phase 7: Update architecture.md and Issues/issues_open.md. +[ ] Phase 1 review gate: present diff for approval. + +Phase 2 — Verification +[ ] Add per-phase unit tests (see "Tests For Phase X" sections). +[ ] pixi run fix +[ ] pixi run check +[ ] pixi run unit-tests +[ ] pixi run integration-tests +[ ] pixi run script-tests +``` + +## Commit Discipline + +When an AI agent follows this plan, every completed Phase 1 +implementation step listed in the status checklist must be staged with +explicit paths and committed locally before moving to the next +implementation step or to the Phase 1 review gate. Follow the rules in +the **Commits** section of `.github/copilot-instructions.md`. + +- One commit per step. Atomic and single-purpose. +- Stage explicit paths only — do not `git add .`. +- Suggested commit messages (≤72 chars, imperative mood, no type prefix): + - `Add baseline category-owner characterization tests` + - `Add CategoryOwner base class` + - `Make DatablockItem inherit CategoryOwner` + - `Add category_owner_to_cif and reuse it in datablock serializer` + - `Make Analysis inherit CategoryOwner` + - `Generalize dirty-flag lookup to CategoryOwner` + - `Introduce ProjectConfig category owner` (optional) + - `Document CategoryOwner in architecture.md` +- Do not stage generated artifacts produced by integration/script/ + notebook tests unless explicitly asked. + +## Purpose + +This plan explains how to make `Analysis` and project-level configuration +reuse the same category-management behavior as real datablocks without making +them real datablocks. + +The current code has two real datablock types: + +- `Structure` +- `Experiment` + +Those are real datablocks because they represent CIF `data_` blocks and +have a user-defined block ID. + +`Analysis` and project-level configuration are different. They own CIF-like +categories, but they are singleton project sections. They should not require a +datablock ID and should not serialize as `data_analysis` or `data_project`. + +The target design is: + +```text +GuardedBase +|-- CategoryItem +|-- CollectionBase +| |-- CategoryCollection +| `-- DatablockCollection +`-- CategoryOwner + |-- DatablockItem + | |-- Structure + | `-- ExperimentBase + |-- Analysis + `-- ProjectConfig # optional follow-up +``` + +`CategoryOwner` contains shared behavior for objects that own flat categories. +`DatablockItem` keeps the additional behavior that only real CIF data blocks +need. + +## Important Definitions + +Use these terms consistently during the migration. + +`DatablockItem` +: A real CIF data block. It serializes with a `data_` header. It has a + `datablock_entry_name`. Examples: `Structure`, `ExperimentBase`. + +`CategoryOwner` +: An object that owns flat sibling categories and can update, enumerate, and + serialize those categories. It does not necessarily have a `data_` header. + Examples after migration: `DatablockItem`, `Analysis`, optional + `ProjectConfig`. + +`CategoryItem` +: A single category row. Example: `Cell`, `SpaceGroup`, `Fitting`, + `Rendering`. + +`CategoryCollection` +: A loop-style category collection. Example: `AtomSites`, `Aliases`, + `JointFitCollection`. + +## Non-Negotiable Rules + +Follow these rules throughout the migration: + +1. Do not make `Analysis` inherit directly from `DatablockItem`. +2. Do not make `Project` inherit from `DatablockItem`. +3. Do not emit `data_analysis` from `Analysis.as_cif`. +4. Do not emit `data_project` in the saved `project.cif` file. +5. Do not rename CIF tags during this migration. +6. Do not change public access paths unless a phase explicitly says so. +7. Keep `project.parameters` limited to fit-relevant structure and experiment + parameters unless a separate design decision changes that later. +8. Keep inactive analysis categories accessible unless the fit-mode policy is + intentionally changed in a separate task. + +## Recommended Branching Strategy + +Use several small pull requests or commits. Each phase should be independently +reviewable. + +Suggested split: + +1. Characterization tests. +2. Add `CategoryOwner`. +3. Move `DatablockItem` onto `CategoryOwner`. +4. Split CIF body serialization. +5. Move `Analysis` onto `CategoryOwner`. +6. Update dirty-flag lookup. +7. Optional project-config cleanup. +8. Documentation cleanup. + +Do not combine phases 4, 5, and 6 into one large change. If something breaks, +small phases make the source of the break obvious. + +## Phase 0: Baseline Safety Tests + +Before changing production code, add or confirm tests that describe current +behavior. + +### Files To Read First + +Read these files before editing: + +- `src/easydiffraction/core/datablock.py` +- `src/easydiffraction/core/category.py` +- `src/easydiffraction/core/collection.py` +- `src/easydiffraction/core/identity.py` +- `src/easydiffraction/core/variable.py` +- `src/easydiffraction/analysis/analysis.py` +- `src/easydiffraction/project/project.py` +- `src/easydiffraction/io/cif/serialize.py` +- `docs/dev/architecture.md` +- `docs/dev/Issues/issues_open.md` + +### Tests To Add Or Confirm + +Add focused tests before refactoring. Good locations are: + +- `tests/unit/easydiffraction/core/` +- `tests/unit/easydiffraction/analysis/` +- `tests/unit/easydiffraction/io/cif/` + +Test these behaviors: + +1. A structure CIF starts with `data_`. +2. An experiment CIF starts with `data_`. +3. `Analysis.as_cif` does not start with `data_`. +4. `Analysis.as_cif` includes `_fitting.mode_type`. +5. `Analysis.as_cif` includes `fitting`, `aliases`, and `constraints` + sections when present. +6. In joint mode, `Analysis.as_cif` includes `joint_fit`. +7. In sequential mode, `Analysis.as_cif` includes `sequential_fit` and + `sequential_fit_extract`. +8. In single mode, inactive mode-specific sections are not serialized. +9. Existing project save/load behavior still reads: + - `project.cif` + - `structures/*.cif` + - `experiments/*.cif` + - `analysis/analysis.cif` + +### Suggested Commands + +Run focused tests first: + +```shell +pixi run unit-tests tests/unit/easydiffraction/core +pixi run unit-tests tests/unit/easydiffraction/analysis +pixi run unit-tests tests/unit/easydiffraction/io/cif +``` + +If these pass, run a broader unit subset: + +```shell +pixi run unit-tests tests/unit/easydiffraction/datablocks +``` + +## Phase 1: Add `CategoryOwner` + +Create: + +```text +src/easydiffraction/core/category_owner.py +``` + +### Responsibility + +`CategoryOwner` should own the behavior currently shared by any object that has +flat categories: + +- category discovery +- category sorting by `_update_priority` +- parameter aggregation +- category update orchestration +- dirty flag +- category table in `help()` +- CIF body serialization support, without a `data_` header + +### Initial Implementation Shape + +Start with this shape. Adjust imports as needed. + +```python +from __future__ import annotations + +from easydiffraction.core.category import CategoryCollection +from easydiffraction.core.category import CategoryItem +from easydiffraction.core.guard import GuardedBase + + +class CategoryOwner(GuardedBase): + """Base class for objects that own flat CIF-like categories.""" + + def __init__(self) -> None: + super().__init__() + self._need_categories_update = True + + @property + def categories(self) -> list: + """All category objects owned by this object, sorted by priority.""" + cats = [ + v + for v in vars(self).values() + if isinstance(v, (CategoryItem, CategoryCollection)) + ] + return sorted(cats, key=lambda c: type(c)._update_priority) + + def _serializable_categories(self) -> list: + """Categories that should be serialized for this owner.""" + return self.categories + + @property + def parameters(self) -> list: + """All parameters from all owned categories.""" + params = [] + for category in self.categories: + params.extend(category.parameters) + return params + + def _update_categories( + self, + *, + called_by_minimizer: bool = False, + ) -> None: + """Run update hooks on all owned categories.""" + if not called_by_minimizer and not self._need_categories_update: + return + + for category in self.categories: + category._update(called_by_minimizer=called_by_minimizer) + + self._need_categories_update = False +``` + +### Help Method + +Move the category-summary part of `DatablockItem.help()` into +`CategoryOwner.help()`. + +Keep it simple: + +1. Call `super().help()`. +2. Render a table of category code, type name, and parameter count. + +### Tests For Phase 1 + +Create fake category owner tests. Do not use `Structure` or `Analysis` yet. + +Test: + +1. `CategoryOwner.categories` finds direct `CategoryItem` attributes. +2. `CategoryOwner.categories` finds direct `CategoryCollection` attributes. +3. Categories are sorted by `_update_priority`. +4. `CategoryOwner.parameters` aggregates parameters from categories. +5. `_update_categories()` calls category `_update()` methods. +6. `_update_categories()` honors `_need_categories_update`. +7. `_serializable_categories()` returns `categories` by default. + +## Phase 2: Make `DatablockItem` Inherit `CategoryOwner` + +Edit: + +```text +src/easydiffraction/core/datablock.py +``` + +### What To Change + +Change: + +```python +class DatablockItem(GuardedBase): +``` + +to: + +```python +class DatablockItem(CategoryOwner): +``` + +Import `CategoryOwner` from the new module. + +Remove from `DatablockItem` any code now provided by `CategoryOwner`: + +- `_need_categories_update` initialization +- generic `_update_categories()` +- `categories` +- `parameters` +- category-summary part of `help()`, if moved + +Keep these in `DatablockItem`: + +- `unique_name` +- `as_cif` +- `_cif_for_display` +- datablock-specific string representations + +### Expected Result + +This phase should not change public behavior. + +These should still work: + +```python +structure.as_cif +experiment.as_cif +structure.categories +experiment.categories +structure.parameters +experiment.parameters +``` + +### Tests For Phase 2 + +Run: + +```shell +pixi run unit-tests tests/unit/easydiffraction/core +pixi run unit-tests tests/unit/easydiffraction/datablocks +pixi run unit-tests tests/unit/easydiffraction/io/cif +``` + +If failures happen, check parent linkage first. Most failures in this phase are +likely caused by categories not having the expected `_parent`. + +## Phase 3: Split CIF Body Serialization From Datablock Serialization + +Edit: + +```text +src/easydiffraction/io/cif/serialize.py +``` + +### Problem + +`datablock_item_to_cif()` currently does two jobs: + +1. Adds the `data_` header. +2. Serializes category content. + +Only real datablocks need job 1. `Analysis` and project config need job 2. + +### Add `category_owner_to_cif()` + +Add a helper like this: + +```python +def category_owner_to_cif( + owner: object, + max_loop_display: int | None = None, +) -> str: + """Render a CategoryOwner-like object's categories without a data_ header.""" + from easydiffraction.core.category import CategoryCollection + from easydiffraction.core.category import CategoryItem + + categories_getter = getattr(owner, "_serializable_categories", None) + if callable(categories_getter): + categories = categories_getter() + else: + categories = [ + v + for v in vars(owner).values() + if isinstance(v, (CategoryItem, CategoryCollection)) + ] + + item_parts = [ + category.as_cif + for category in categories + if isinstance(category, CategoryItem) and category.as_cif + ] + + collection_parts = [ + category_collection_to_cif(category, max_display=max_loop_display) + for category in categories + if isinstance(category, CategoryCollection) + ] + + return "\n\n".join([part for part in item_parts + collection_parts if part]) +``` + +Keep the current item-first, collection-second ordering unless a separate test +and design decision changes it. + +### Update `datablock_item_to_cif()` + +After adding `category_owner_to_cif()`, reduce `datablock_item_to_cif()` to: + +```python +header = f"data_{datablock._identity.datablock_entry_name}" +body = category_owner_to_cif(datablock, max_loop_display=max_loop_display) +if not body: + return header +return "\n\n".join([header, body]) +``` + +### Tests For Phase 3 + +Add or update tests for: + +1. `category_owner_to_cif()` does not emit `data_`. +2. `datablock_item_to_cif()` still emits `data_`. +3. Empty category owners serialize to an empty string. +4. Empty datablocks serialize to only the header. +5. Loop truncation still works for display. + +Run: + +```shell +pixi run unit-tests tests/unit/easydiffraction/io/cif +pixi run unit-tests tests/unit/easydiffraction/core +pixi run unit-tests tests/unit/easydiffraction/datablocks +``` + +## Phase 4: Move `Analysis` Onto `CategoryOwner` + +Edit: + +```text +src/easydiffraction/analysis/analysis.py +``` + +### Main Change + +Change: + +```python +class Analysis: +``` + +to: + +```python +class Analysis(CategoryOwner): +``` + +Call `super().__init__()` at the start of `Analysis.__init__()`. + +### Parent Linkage + +Because `Analysis` will now be a `GuardedBase` subclass, private assignment of +`GuardedBase` children should normally set `_parent` automatically. + +Still verify these categories have `_parent is analysis`: + +- `analysis.aliases` +- `analysis.constraints` +- `analysis.fitting` +- `analysis.joint_fit` +- `analysis.sequential_fit` +- `analysis.sequential_fit_extract` + +Some are currently assigned manually and some may not be. Make the result +consistent. + +### Keep Existing Public API + +Do not remove these access paths: + +```python +project.analysis.fitting +project.analysis.aliases +project.analysis.constraints +project.analysis.joint_fit +project.analysis.sequential_fit +project.analysis.sequential_fit_extract +project.analysis.fitting_mode_type +``` + +If direct public attributes currently exist without properties, convert them +carefully. Less experienced agents should prefer a compatibility-preserving +change: + +1. Keep the public access path. +2. Add properties only when needed. +3. Avoid large style cleanup in the same phase. + +### Add `_serializable_categories()` + +Add this method to `Analysis`: + +```python +def _serializable_categories(self) -> list: + """Analysis categories that should be written for the active fit mode.""" + categories = [ + self.fitting, + self.aliases, + self.constraints, + ] + + if self._fitting_mode_type is FitModeEnum.JOINT: + categories.append(self.joint_fit) + elif self._fitting_mode_type is FitModeEnum.SEQUENTIAL: + categories.extend([ + self.sequential_fit, + self.sequential_fit_extract, + ]) + + return categories +``` + +This preserves the current rule: + +- shared analysis categories always serialize +- joint-only category serializes only in joint mode +- sequential categories serialize only in sequential mode +- inactive categories remain accessible but are not written + +### Update `Analysis.as_cif` + +Keep the leading mode line: + +```text +_fitting.mode_type +``` + +Then append `category_owner_to_cif(self)`. + +Example implementation shape: + +```python +@property +def as_cif(self) -> str: + self._update_categories() + return analysis_to_cif(self) +``` + +Then update `analysis_to_cif()` in `serialize.py` to use +`category_owner_to_cif(analysis)` internally. + +Recommended serializer shape: + +```python +def analysis_to_cif(analysis: object) -> str: + parts = [ + f"_fitting.mode_type {format_value(analysis.fitting_mode_type)}", + ] + body = category_owner_to_cif(analysis) + if body: + parts.append(body) + return "\n\n".join(parts) +``` + +### Keep Or Adjust `_update_categories()` + +`Analysis._update_categories()` currently applies constraints. That is +analysis-specific and must not be lost. + +Safest first version: + +```python +def _update_categories( + self, + *, + called_by_minimizer: bool = False, +) -> None: + super()._update_categories(called_by_minimizer=called_by_minimizer) + + if self.constraints.enabled and self.constraints._items: + self.constraints_handler.set_aliases(self.aliases) + self.constraints_handler.set_constraints(self.constraints) + self.constraints_handler.apply() +``` + +If this changes behavior, use the existing constraint logic and leave a short +comment explaining why shared category updates are intentionally skipped or +ordered differently. + +### Tests For Phase 4 + +Add tests for: + +1. `isinstance(analysis, CategoryOwner)`. +2. `analysis.categories` includes analysis categories. +3. `analysis.parameters` includes fitting and analysis config descriptors. +4. Parent links are set on all analysis categories. +5. `analysis.as_cif` still does not start with `data_`. +6. `analysis.as_cif` still includes `_fitting.mode_type`. +7. Mode-specific serialization behavior is unchanged. +8. `analysis.help()` still works and respects `_help_filter()`. + +Run: + +```shell +pixi run unit-tests tests/unit/easydiffraction/analysis +pixi run unit-tests tests/unit/easydiffraction/io/cif +pixi run unit-tests tests/unit/easydiffraction/project +``` + +## Phase 5: Update Dirty-Flag Lookup + +Edit: + +```text +src/easydiffraction/core/variable.py +``` + +### Current Behavior + +Descriptors currently find a `DatablockItem` ancestor when marking an owner +dirty after value changes. + +That is too narrow after this migration. `Analysis` will also be a category +owner. + +### Target Behavior + +Descriptors should mark the nearest `CategoryOwner` ancestor dirty. + +Change helper naming carefully. For example: + +```python +def _category_owner(self) -> object | None: + from easydiffraction.core.category_owner import CategoryOwner + + return self._parent_of_type(CategoryOwner) +``` + +Then update value setters to use this owner: + +```python +parent_owner = self._category_owner() +if parent_owner is not None: + parent_owner._need_categories_update = True +``` + +### Compatibility Check + +This must still mark structures and experiments dirty because +`DatablockItem` now inherits `CategoryOwner`. + +### Tests For Phase 5 + +Add tests for: + +1. Changing a structure parameter marks the structure dirty. +2. Changing an experiment parameter marks the experiment dirty. +3. Changing an analysis category descriptor marks analysis dirty. +4. `_set_value_from_minimizer()` still marks the owner dirty. + +Run: + +```shell +pixi run unit-tests tests/unit/easydiffraction/core +pixi run unit-tests tests/unit/easydiffraction/analysis +pixi run unit-tests tests/unit/easydiffraction/datablocks +``` + +## Phase 6: Optional Project Config Cleanup + +This phase is optional and should be a separate PR. Do not do it until +`Analysis` is stable. + +### Goal + +Project-level configuration also has category-like behavior. Today: + +- `ProjectInfo` is a special metadata object. +- `Rendering` is already a `CategoryItem`. +- `project_config_to_cif()` manually combines them. + +The cleaner long-term shape is: + +```text +Project +`-- ProjectConfig(CategoryOwner) + |-- ProjectInfo or ProjectMetadata + `-- Rendering +``` + +### Low-Risk Version + +If converting `ProjectInfo` into a `CategoryItem` is too disruptive, do not do +it immediately. + +Instead: + +1. Add `ProjectConfig(CategoryOwner)`. +2. Put only `rendering` inside it at first. +3. Keep `ProjectInfo` special. +4. Keep `project.info` and `project.rendering` public access unchanged. +5. Use shared category-owner serialization only for `rendering`. + +### Full Version + +Convert project metadata into a real category item: + +```text +ProjectMetadata(CategoryItem) +``` + +It should own descriptors for: + +- project id +- title +- description +- created timestamp +- last modified timestamp + +This is more work because timestamps and path handling need care. Do not mix +this with the `Analysis` migration. + +### Save/Load Rule + +Even after this cleanup, saved `project.cif` should remain a section file +without an explicit `data_` header. The loader currently wraps project config +with `data_project` before parsing, and that can remain an implementation +detail. + +## Phase 7: Documentation Updates + +Update: + +```text +docs/dev/architecture.md +docs/dev/Issues/issues_open.md +``` + +### Architecture Updates + +In `architecture.md`, update the object hierarchy to include +`CategoryOwner`. + +Replace the current hierarchy with something like: + +```text +GuardedBase +|-- CategoryItem +|-- CollectionBase +| |-- CategoryCollection +| `-- DatablockCollection +`-- CategoryOwner + `-- DatablockItem +``` + +Add this design rule: + +```text +A real datablock is a CategoryOwner that serializes as a CIF data_ block. +Not every CategoryOwner is a real datablock. +``` + +Update the flat category structure section from: + +```text +Owner (DatablockItem / Analysis) +``` + +to: + +```text +Owner (CategoryOwner) +``` + +Explain that: + +- `Structure` and `ExperimentBase` are real datablocks. +- `Analysis` is a singleton category-owning section. +- Project configuration may become a singleton category-owning section. + +### Issues File Updates + +When the migration is complete, update issue 5 in +`docs/dev/Issues/issues_open.md`. + +If fully complete, remove the issue and add a note to closed issues if that is +the project convention. + +If only `CategoryOwner` is introduced but `Analysis` is not migrated yet, +change the issue text to reflect the remaining work. + +## Final Acceptance Criteria + +The migration is done when all of these are true: + +1. `CategoryOwner` exists and is tested. +2. `DatablockItem` inherits from `CategoryOwner`. +3. `Structure.as_cif` still emits `data_`. +4. `Experiment.as_cif` still emits `data_`. +5. `Analysis` inherits from `CategoryOwner`. +6. `Analysis.as_cif` does not emit a `data_` header. +7. `Analysis` uses shared category discovery for `categories`. +8. `Analysis` uses shared parameter enumeration for `parameters`. +9. Inactive analysis categories remain accessible. +10. Inactive analysis categories are not serialized. +11. Dirty-flag marking works for structures, experiments, and analysis. +12. Project save/load format remains compatible. +13. Documentation distinguishes real datablocks from category-owning sections. + +## Common Mistakes To Avoid + +### Mistake: Making `Analysis` A Fake Datablock + +Do not solve this by setting: + +```python +self._identity.datablock_entry_name = lambda: "analysis" +``` + +That makes parameter identities and CIF output imply that analysis is a real +datablock. It is not. + +### Mistake: Adding A Boolean Flag To `DatablockItem` + +Avoid: + +```python +class Analysis(DatablockItem): + emit_data_header = False +``` + +This works mechanically but weakens the model. The class name +`DatablockItem` should mean real datablock. + +### Mistake: Changing Fit Parameter Semantics + +Do not change `project.parameters` to include analysis parameters during this +migration. Analysis parameters are configuration parameters, not model +parameters for fitting. + +### Mistake: Refactoring Public API While Moving Base Classes + +Do not rename `fitting_mode_type`, `joint_fit`, `sequential_fit`, +`rendering`, or `info` during this migration. Naming cleanup can happen later. + +### Mistake: Assuming `vars(owner)` Order Is A Design Contract + +If serialization order matters, write tests for it. Prefer explicit +`_serializable_categories()` hooks for owners that need a specific active +subset. + +## Minimal Agent Checklist + +The authoritative checklist lives in the **Status** section at the top of +this document. Keep it updated as work progresses. + +## Phase 2 Verification + +Run these in order after Phase 1 is reviewed and approved. + +Per-area focused unit tests first: + +```shell +pixi run unit-tests tests/unit/easydiffraction/core +pixi run unit-tests tests/unit/easydiffraction/analysis +pixi run unit-tests tests/unit/easydiffraction/io/cif +pixi run unit-tests tests/unit/easydiffraction/datablocks +pixi run unit-tests tests/unit/easydiffraction/project +``` + +Then the full project verification commands required by +`.github/copilot-instructions.md`: + +```shell +pixi run fix +pixi run check +pixi run unit-tests +pixi run integration-tests +pixi run script-tests +``` + +`pixi run fix` regenerates `docs/dev/package-structure-*.md` +automatically — do not edit those by hand. Accept the auto-fix output +and re-run `pixi run check` until clean. + +## Suggested Pull Request + +**Title:** Separate category-owning sections from real CIF datablocks + +**Description (end-user oriented):** + +This change reorganizes how the library models the different pieces of +a project that own CIF-like categories. Crystal structures and +experiments stay "real" CIF data blocks — each one is saved with its +own `data_` header, exactly as before. The `Analysis` section +(fitting mode, aliases, constraints, joint/sequential fit settings) and +project-level configuration are now treated as singleton sections that +share the same convenient features (category discovery, parameter +listing, dirty tracking, `help()` tables) without pretending to be data +blocks. As a result: + +- Saved CIF files keep their current layout: structures and experiments + start with `data_`; analysis and project configuration stay as + section files without a fake `data_` header. +- `project.parameters`, `analysis.parameters`, and `structure.parameters` + behave consistently and discoverably. +- Future project metadata cleanup can reuse the same base class. + +No user-facing API names change in this PR; existing tutorials, +scripts, and saved projects continue to work. From fc3129f90c9824bd97eeb9a38bd1034bf7f58e23 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:25:24 +0200 Subject: [PATCH 02/15] Add category-owner baseline tests --- docs/dev/plan_category-owner-sections.md | 2 +- .../test_serialize_category_owner_baseline.py | 97 +++++++++++++++++++ 2 files changed, 98 insertions(+), 1 deletion(-) create mode 100644 tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 2c93ea9b..c4f1a9ec 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -19,7 +19,7 @@ Status checklist (mark `[x]` as completed): ```text Phase 1 — Implementation -[ ] Phase 0: Add baseline characterization tests. +[x] Phase 0: Add baseline characterization tests. [ ] Phase 1: Add CategoryOwner in core/category_owner.py. [ ] Phase 2: Make DatablockItem inherit CategoryOwner. [ ] Phase 3: Split CIF body serialization (category_owner_to_cif). diff --git a/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py b/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py new file mode 100644 index 00000000..e5aaab09 --- /dev/null +++ b/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py @@ -0,0 +1,97 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +from easydiffraction.datablocks.experiment.item.factory import ExperimentFactory +from easydiffraction.datablocks.structure.item.base import Structure +from easydiffraction.project.project import Project + + +def test_real_structure_as_cif_starts_with_data_header() -> None: + structure = Structure(name='nickelate') + + assert structure.as_cif.startswith('data_nickelate') + + +def test_real_experiment_as_cif_starts_with_data_header() -> None: + experiment = ExperimentFactory.from_scratch( + name='powder_scan', + sample_form='powder', + beam_mode='constant wavelength', + radiation_probe='neutron', + scattering_type='bragg', + ) + + assert experiment.as_cif.startswith('data_powder_scan') + + +def test_real_analysis_as_cif_is_singleton_section_without_data_header() -> None: + project = Project(name='proj') + + analysis_cif = project.analysis.as_cif + + assert analysis_cif.startswith('_fitting.mode_type single') + assert not analysis_cif.startswith('data_') + assert '_fitting.minimizer_type' in analysis_cif + assert '_joint_fit.experiment_id' not in analysis_cif + assert '_sequential_fit.data_dir' not in analysis_cif + assert '_sequential_fit_extract.id' not in analysis_cif + + +def test_real_analysis_as_cif_includes_aliases_and_constraints_when_present() -> None: + project = Project(name='proj') + project.structures.create(name='phase_1') + parameter = project.structures['phase_1'].cell.length_a + + analysis = project.analysis + analysis.aliases.create(label='a_param', param=parameter) + analysis.constraints.create(expression='a_param = a_param') + + analysis_cif = analysis.as_cif + + assert '_alias.label' in analysis_cif + assert '_alias.param_unique_name' in analysis_cif + assert '_constraint.expression' in analysis_cif + assert 'a_param = a_param' in analysis_cif + + +def test_real_analysis_as_cif_includes_joint_fit_only_in_joint_mode() -> None: + project = Project(name='proj') + analysis = project.analysis + analysis._set_fitting_mode_type('joint') + analysis.joint_fit.create(experiment_id='ex1', weight=0.5) + + analysis_cif = analysis.as_cif + + assert not analysis_cif.startswith('data_') + assert '_fitting.mode_type joint' in analysis_cif + assert '_joint_fit.experiment_id' in analysis_cif + assert '_joint_fit.weight' in analysis_cif + assert '_sequential_fit.data_dir' not in analysis_cif + assert '_sequential_fit_extract.id' not in analysis_cif + + +def test_real_analysis_as_cif_includes_sequential_sections_only_in_sequential_mode() -> None: + project = Project(name='proj') + analysis = project.analysis + analysis._set_fitting_mode_type('sequential') + analysis.sequential_fit.data_dir.value = 'scans' + analysis.sequential_fit.file_pattern.value = '*.xye' + analysis.sequential_fit_extract.create( + id='temperature', + target='diffrn.ambient_temperature', + pattern=r'temp_(\d+)', + required=True, + ) + + analysis_cif = analysis.as_cif + + assert not analysis_cif.startswith('data_') + assert '_fitting.mode_type sequential' in analysis_cif + assert '_sequential_fit.data_dir scans' in analysis_cif + assert '_sequential_fit.file_pattern *.xye' in analysis_cif + assert '_sequential_fit_extract.id' in analysis_cif + assert '_sequential_fit_extract.target' in analysis_cif + assert 'diffrn.ambient_temperature' in analysis_cif + assert '_joint_fit.experiment_id' not in analysis_cif \ No newline at end of file From a629a80268b84136a07f182c303bf5a669591436 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:35:27 +0200 Subject: [PATCH 03/15] Add CategoryOwner base class --- docs/dev/plan_category-owner-sections.md | 2 +- src/easydiffraction/core/category_owner.py | 74 ++++++++++++++++++++++ 2 files changed, 75 insertions(+), 1 deletion(-) create mode 100644 src/easydiffraction/core/category_owner.py diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index c4f1a9ec..9361462d 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -20,7 +20,7 @@ Status checklist (mark `[x]` as completed): ```text Phase 1 — Implementation [x] Phase 0: Add baseline characterization tests. -[ ] Phase 1: Add CategoryOwner in core/category_owner.py. +[x] Phase 1: Add CategoryOwner in core/category_owner.py. [ ] Phase 2: Make DatablockItem inherit CategoryOwner. [ ] Phase 3: Split CIF body serialization (category_owner_to_cif). [ ] Phase 4: Move Analysis onto CategoryOwner. diff --git a/src/easydiffraction/core/category_owner.py b/src/easydiffraction/core/category_owner.py new file mode 100644 index 00000000..94f43231 --- /dev/null +++ b/src/easydiffraction/core/category_owner.py @@ -0,0 +1,74 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +from easydiffraction.core.category import CategoryCollection +from easydiffraction.core.category import CategoryItem +from easydiffraction.core.guard import GuardedBase + + +class CategoryOwner(GuardedBase): + """Base class for objects that own flat CIF-like categories.""" + + def __init__(self) -> None: + super().__init__() + self._need_categories_update = True + + @property + def categories(self) -> list: + """All category objects owned by this object, sorted by priority.""" + categories = [ + value + for value in vars(self).values() + if isinstance(value, (CategoryItem, CategoryCollection)) + ] + return sorted(categories, key=lambda category: type(category)._update_priority) + + def _serializable_categories(self) -> list: + """Categories that should be serialized for this owner.""" + return self.categories + + @property + def parameters(self) -> list: + """All parameters from all owned categories.""" + parameters = [] + for category in self.categories: + parameters.extend(category.parameters) + return parameters + + def _update_categories( + self, + *, + called_by_minimizer: bool = False, + ) -> None: + """Run update hooks on all owned categories.""" + if not called_by_minimizer and not self._need_categories_update: + return + + for category in self.categories: + category._update(called_by_minimizer=called_by_minimizer) + + self._need_categories_update = False + + def help(self) -> None: + """Print a summary of public attributes and categories.""" + super().help() + + from easydiffraction.utils.logging import console # noqa: PLC0415 + from easydiffraction.utils.utils import render_table # noqa: PLC0415 + + categories = self.categories + if categories: + console.paragraph('Categories') + rows = [] + for category in categories: + code = category._identity.category_code or type(category).__name__ + type_name = type(category).__name__ + num_params = len(category.parameters) + rows.append([code, type_name, str(num_params)]) + render_table( + columns_headers=['Category', 'Type', '# Parameters'], + columns_alignment=['left', 'left', 'right'], + columns_data=rows, + ) \ No newline at end of file From d456ef29d06d98eb924d94c39ba94961212fde91 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:36:26 +0200 Subject: [PATCH 04/15] Make DatablockItem inherit CategoryOwner --- docs/dev/plan_category-owner-sections.md | 2 +- src/easydiffraction/core/datablock.py | 80 +----------------------- 2 files changed, 3 insertions(+), 79 deletions(-) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 9361462d..263902a3 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -21,7 +21,7 @@ Status checklist (mark `[x]` as completed): Phase 1 — Implementation [x] Phase 0: Add baseline characterization tests. [x] Phase 1: Add CategoryOwner in core/category_owner.py. -[ ] Phase 2: Make DatablockItem inherit CategoryOwner. +[x] Phase 2: Make DatablockItem inherit CategoryOwner. [ ] Phase 3: Split CIF body serialization (category_owner_to_cif). [ ] Phase 4: Move Analysis onto CategoryOwner. [ ] Phase 5: Update dirty-flag lookup to CategoryOwner. diff --git a/src/easydiffraction/core/datablock.py b/src/easydiffraction/core/datablock.py index 0ac20176..4eb18866 100644 --- a/src/easydiffraction/core/datablock.py +++ b/src/easydiffraction/core/datablock.py @@ -3,20 +3,14 @@ from __future__ import annotations -from easydiffraction.core.category import CategoryCollection -from easydiffraction.core.category import CategoryItem +from easydiffraction.core.category_owner import CategoryOwner from easydiffraction.core.collection import CollectionBase -from easydiffraction.core.guard import GuardedBase from easydiffraction.core.variable import Parameter -class DatablockItem(GuardedBase): +class DatablockItem(CategoryOwner): """Base class for items in a datablock collection.""" - def __init__(self) -> None: - super().__init__() - self._need_categories_update = True - def __str__(self) -> str: """Human-readable representation of this component.""" name = self.unique_name @@ -31,59 +25,11 @@ def __repr__(self) -> str: num_categories = len(self.categories) return f'<{cls} datablock "{name}" ({num_categories} categories)>' - def _update_categories( - self, - *, - called_by_minimizer: bool = False, - ) -> None: - # TODO: Make abstract method and implement in subclasses. - # This should call apply_symmetry and apply_constraints in the - # case of structures. In the case of experiments, it should - # run calculations to update the "data" categories. - # Any parameter change should set _need_categories_update to - # True. - # Calling as_cif or data getter should first check this flag - # and call this method if True. - # Should this be also called when parameters are accessed? E.g. - # if one change background coefficients, then access the - # background points in the data category? - # - # Dirty-flag guard: skip if no parameter has changed since the - # last update. Minimisers use _set_value_from_minimizer() - # which bypasses validation but still sets this flag. - # During fitting the guard is bypassed because experiment - # calculations depend on structure parameters owned by a - # different DatablockItem whose flag changes are invisible here. - if not called_by_minimizer and not self._need_categories_update: - return - - for category in self.categories: - category._update(called_by_minimizer=called_by_minimizer) - - self._need_categories_update = False - @property def unique_name(self) -> str | None: """Unique name of this datablock item (from identity).""" return self._identity.datablock_entry_name - @property - def categories(self) -> list: - """All category objects in this datablock by priority.""" - cats = [ - v for v in vars(self).values() if isinstance(v, (CategoryItem, CategoryCollection)) - ] - # Sort by _update_priority (lower values first) - return sorted(cats, key=lambda c: type(c)._update_priority) - - @property - def parameters(self) -> list: - """All parameters from all categories in this datablock.""" - params = [] - for v in self.categories: - params.extend(v.parameters) - return params - @property def as_cif(self) -> str: """Return CIF representation of this object.""" @@ -113,28 +59,6 @@ def _cif_for_display(self, max_loop_display: int = 20) -> str: self._update_categories() return datablock_item_to_cif(self, max_loop_display=max_loop_display) - def help(self) -> None: - """Print a summary of public attributes and categories.""" - super().help() - - from easydiffraction.utils.logging import console # noqa: PLC0415 - from easydiffraction.utils.utils import render_table # noqa: PLC0415 - - cats = self.categories - if cats: - console.paragraph('Categories') - rows = [] - for c in cats: - code = c._identity.category_code or type(c).__name__ - type_name = type(c).__name__ - num_params = len(c.parameters) - rows.append([code, type_name, str(num_params)]) - render_table( - columns_headers=['Category', 'Type', '# Parameters'], - columns_alignment=['left', 'left', 'right'], - columns_data=rows, - ) - # ====================================================================== From 9cf524e03d16798bc130f79acff24803fc26b82a Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:37:20 +0200 Subject: [PATCH 05/15] Split category-owner CIF serialization --- docs/dev/plan_category-owner-sections.md | 2 +- src/easydiffraction/io/cif/serialize.py | 62 ++++++++++++++---------- 2 files changed, 38 insertions(+), 26 deletions(-) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 263902a3..1ef0426d 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -22,7 +22,7 @@ Phase 1 — Implementation [x] Phase 0: Add baseline characterization tests. [x] Phase 1: Add CategoryOwner in core/category_owner.py. [x] Phase 2: Make DatablockItem inherit CategoryOwner. -[ ] Phase 3: Split CIF body serialization (category_owner_to_cif). +[x] Phase 3: Split CIF body serialization (category_owner_to_cif). [ ] Phase 4: Move Analysis onto CategoryOwner. [ ] Phase 5: Update dirty-flag lookup to CategoryOwner. [ ] Phase 6: (Optional) ProjectConfig cleanup. diff --git a/src/easydiffraction/io/cif/serialize.py b/src/easydiffraction/io/cif/serialize.py index ae5b59fa..13547722 100644 --- a/src/easydiffraction/io/cif/serialize.py +++ b/src/easydiffraction/io/cif/serialize.py @@ -268,6 +268,39 @@ def _row(item: object) -> list[str]: return '\n'.join(lines) +def category_owner_to_cif( + owner: object, + max_loop_display: int | None = None, +) -> str: + """Render a category-owning object without a ``data_`` header.""" + from easydiffraction.core.category import CategoryCollection # noqa: PLC0415 + from easydiffraction.core.category import CategoryItem # noqa: PLC0415 + + categories_getter = getattr(owner, '_serializable_categories', None) + if callable(categories_getter): + categories = categories_getter() + else: + categories = [ + value + for value in vars(owner).values() + if isinstance(value, (CategoryItem, CategoryCollection)) + ] + + item_parts = [ + category.as_cif + for category in categories + if isinstance(category, CategoryItem) and category.as_cif + ] + + collection_parts = [ + category_collection_to_cif(category, max_display=max_loop_display) + for category in categories + if isinstance(category, CategoryCollection) + ] + + return '\n\n'.join([part for part in item_parts + collection_parts if part]) + + def datablock_item_to_cif( datablock: object, max_loop_display: int | None = None, @@ -290,32 +323,11 @@ def datablock_item_to_cif( str CIF text representing the datablock as a loop. """ - # Local imports to avoid import-time cycles - from easydiffraction.core.category import CategoryCollection # noqa: PLC0415 - from easydiffraction.core.category import CategoryItem # noqa: PLC0415 - header = f'data_{datablock._identity.datablock_entry_name}' - parts: list[str] = [header] - - # First categories - parts.extend( - cif_text - for cif_text in (v.as_cif for v in vars(datablock).values() if isinstance(v, CategoryItem)) - if cif_text - ) - - # Then collections - parts.extend( - cif_text - for cif_text in ( - category_collection_to_cif(v, max_display=max_loop_display) - for v in vars(datablock).values() - if isinstance(v, CategoryCollection) - ) - if cif_text - ) - - return '\n\n'.join(parts) + body = category_owner_to_cif(datablock, max_loop_display=max_loop_display) + if not body: + return header + return '\n\n'.join([header, body]) def datablock_collection_to_cif(collection: object) -> str: From a1e0fbb5cac716b649c96599c85d2295927caa8d Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:39:56 +0200 Subject: [PATCH 06/15] Make Analysis inherit CategoryOwner --- docs/dev/plan_category-owner-sections.md | 2 +- src/easydiffraction/analysis/analysis.py | 77 ++++++++++++++++++++---- src/easydiffraction/io/cif/serialize.py | 40 ++++++------ 3 files changed, 84 insertions(+), 35 deletions(-) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 1ef0426d..28d10500 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -23,7 +23,7 @@ Phase 1 — Implementation [x] Phase 1: Add CategoryOwner in core/category_owner.py. [x] Phase 2: Make DatablockItem inherit CategoryOwner. [x] Phase 3: Split CIF body serialization (category_owner_to_cif). -[ ] Phase 4: Move Analysis onto CategoryOwner. +[x] Phase 4: Move Analysis onto CategoryOwner. [ ] Phase 5: Update dirty-flag lookup to CategoryOwner. [ ] Phase 6: (Optional) ProjectConfig cleanup. [ ] Phase 7: Update architecture.md and Issues/issues_open.md. diff --git a/src/easydiffraction/analysis/analysis.py b/src/easydiffraction/analysis/analysis.py index 86335241..cb0ccbfe 100644 --- a/src/easydiffraction/analysis/analysis.py +++ b/src/easydiffraction/analysis/analysis.py @@ -21,6 +21,7 @@ ) from easydiffraction.analysis.enums import FitModeEnum from easydiffraction.analysis.fitting import Fitter +from easydiffraction.core.category_owner import CategoryOwner from easydiffraction.core.guard import _apply_help_filter from easydiffraction.core.singleton import ConstraintsHandler from easydiffraction.core.variable import NumericDescriptor @@ -346,7 +347,7 @@ def as_cif(self) -> None: self._analysis.show_as_cif() -class Analysis: +class Analysis(CategoryOwner): """ High-level orchestration of analysis tasks for a Project. @@ -364,31 +365,63 @@ def __init__(self, project: object) -> None: project : object The project that owns models and experiments. """ - self.project = project + super().__init__() + self._project = project self._aliases_type: str = AliasesFactory.default_tag() - self.aliases = AliasesFactory.create(self._aliases_type) + self._aliases = AliasesFactory.create(self._aliases_type) self._constraints_type: str = ConstraintsFactory.default_tag() - self.constraints = ConstraintsFactory.create(self._constraints_type) - self.constraints_handler = ConstraintsHandler.get() + self._constraints = ConstraintsFactory.create(self._constraints_type) + self._constraints_handler = ConstraintsHandler.get() self._fitting: Fitting = FittingFactory.create(FittingFactory.default_tag()) - self._fitting._parent = self self._fitting_mode_type: FitModeEnum = FitModeEnum.default() self._joint_fit: JointFitCollection = JointFitCollection() self._sequential_fit: SequentialFit = SequentialFitFactory.create( SequentialFitFactory.default_tag() ) - self._sequential_fit._parent = self self._sequential_fit_extract = SequentialFitExtractCollection() - self.fitter = Fitter(self._fitting.minimizer_type.value) - self.fit_results = None + self._fitter = Fitter(self._fitting.minimizer_type.value) + self._fit_results = None self._parameter_snapshots: dict[str, dict[str, dict]] = {} self._display = AnalysisDisplay(self) + @property + def project(self) -> object: + """Project that owns this analysis section.""" + return self._project + + @property + def aliases(self) -> object: + """Alias mappings used by symbolic constraints and displays.""" + return self._aliases + + @property + def constraints(self) -> object: + """Symbolic constraints owned by this analysis section.""" + return self._constraints + @property def display(self) -> AnalysisDisplay: """Display helper for parameter tables, CIF, and fit results.""" return self._display + @property + def fitter(self) -> Fitter: + """Fitting engine used by this analysis object.""" + return self._fitter + + @fitter.setter + def fitter(self, value: Fitter) -> None: + self._fitter = value + + @property + def fit_results(self) -> object | None: + """Results from the most recent fit, if any.""" + return self._fit_results + + @fit_results.setter + def fit_results(self, value: object | None) -> None: + self._fit_results = value + def help(self) -> None: """Print a summary of analysis properties and methods.""" cls = type(self) @@ -453,6 +486,24 @@ def _help_filter( filtered_properties = [name for name in properties if name not in hidden_properties] return filtered_properties, methods + def _serializable_categories(self) -> list: + """Analysis categories that should be written for the active fit mode.""" + categories = [ + self.fitting, + self.aliases, + self.constraints, + ] + + if self._fitting_mode_type is FitModeEnum.JOINT: + categories.append(self.joint_fit) + elif self._fitting_mode_type is FitModeEnum.SEQUENTIAL: + categories.extend([ + self.sequential_fit, + self.sequential_fit_extract, + ]) + + return categories + # ------------------------------------------------------------------ # Parameter helpers # ------------------------------------------------------------------ @@ -968,13 +1019,13 @@ def _update_categories( called_by_minimizer : bool, default=False Whether this is called during fitting. """ - del called_by_minimizer + super()._update_categories(called_by_minimizer=called_by_minimizer) # Apply constraints to sync dependent parameters if self.constraints.enabled and self.constraints._items: - self.constraints_handler.set_aliases(self.aliases) - self.constraints_handler.set_constraints(self.constraints) - self.constraints_handler.apply() + self._constraints_handler.set_aliases(self.aliases) + self._constraints_handler.set_constraints(self.constraints) + self._constraints_handler.apply() @property def as_cif(self) -> str: diff --git a/src/easydiffraction/io/cif/serialize.py b/src/easydiffraction/io/cif/serialize.py index 13547722..eb9639a5 100644 --- a/src/easydiffraction/io/cif/serialize.py +++ b/src/easydiffraction/io/cif/serialize.py @@ -404,30 +404,28 @@ def analysis_to_cif(analysis: object) -> str: """Render analysis metadata, aliases, and constraints to CIF.""" parts: list[str] = [f'_fitting.mode_type {format_value(analysis.fitting_mode_type)}'] - fitting_cif = analysis.fitting.as_cif - if fitting_cif: - parts.append(fitting_cif) + body = category_owner_to_cif(analysis) + if not body: + fallback_sections = [ + getattr(analysis, 'fitting', None), + getattr(analysis, 'aliases', None), + getattr(analysis, 'constraints', None), + ] - aliases_cif = analysis.aliases.as_cif - if aliases_cif: - parts.append(aliases_cif) + if analysis.fitting_mode_type == 'joint': + fallback_sections.append(getattr(analysis, 'joint_fit', None)) + elif analysis.fitting_mode_type == 'sequential': + fallback_sections.extend([ + getattr(analysis, 'sequential_fit', None), + getattr(analysis, 'sequential_fit_extract', None), + ]) - constraints_cif = analysis.constraints.as_cif - if constraints_cif: - parts.append(constraints_cif) + body = '\n\n'.join([ + _as_cif_text(section) for section in fallback_sections if section is not None + ]) - if analysis.fitting_mode_type == 'joint': - joint_fit_cif = analysis.joint_fit.as_cif - if joint_fit_cif: - parts.append(joint_fit_cif) - elif analysis.fitting_mode_type == 'sequential': - sequential_fit_cif = analysis.sequential_fit.as_cif - if sequential_fit_cif: - parts.append(sequential_fit_cif) - - sequential_extract_cif = analysis.sequential_fit_extract.as_cif - if sequential_extract_cif: - parts.append(sequential_extract_cif) + if body: + parts.append(body) return '\n\n'.join(parts) From d9ba600f30a69937c3bedcdb1b8dc20b83ea0ab0 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:42:00 +0200 Subject: [PATCH 07/15] Generalize dirty-flag lookup to CategoryOwner --- docs/dev/plan_category-owner-sections.md | 2 +- src/easydiffraction/core/variable.py | 32 ++++++++++++------------ 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 28d10500..4d921a58 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -24,7 +24,7 @@ Phase 1 — Implementation [x] Phase 2: Make DatablockItem inherit CategoryOwner. [x] Phase 3: Split CIF body serialization (category_owner_to_cif). [x] Phase 4: Move Analysis onto CategoryOwner. -[ ] Phase 5: Update dirty-flag lookup to CategoryOwner. +[x] Phase 5: Update dirty-flag lookup to CategoryOwner. [ ] Phase 6: (Optional) ProjectConfig cleanup. [ ] Phase 7: Update architecture.md and Issues/issues_open.md. [ ] Phase 1 review gate: present diff for approval. diff --git a/src/easydiffraction/core/variable.py b/src/easydiffraction/core/variable.py index 76554bdd..bb4b03f5 100644 --- a/src/easydiffraction/core/variable.py +++ b/src/easydiffraction/core/variable.py @@ -128,11 +128,11 @@ def _parent_of_type(self, cls: type) -> object | None: obj = getattr(obj, '_parent', None) return None - def _datablock_item(self) -> object | None: - """Return the DatablockItem ancestor, if any.""" - from easydiffraction.core.datablock import DatablockItem # noqa: PLC0415 + def _category_owner(self) -> object | None: + """Return the CategoryOwner ancestor, if any.""" + from easydiffraction.core.category_owner import CategoryOwner # noqa: PLC0415 - return self._parent_of_type(DatablockItem) + return self._parent_of_type(CategoryOwner) @property def value(self) -> object: @@ -153,18 +153,18 @@ def value(self, v: object) -> None: current=self._value, ) - # Mark parent datablock as needing categories update + # Mark the owning category owner as needing an update # TODO: Check if it is actually in use? - parent_datablock = self._datablock_item() - if parent_datablock is not None: - parent_datablock._need_categories_update = True + parent_owner = self._category_owner() + if parent_owner is not None: + parent_owner._need_categories_update = True def _set_value_from_minimizer(self, v: object) -> None: """ Set the value from a minimizer, bypassing validation. Writes ``_value`` directly — no type or range checks — but still - marks the owning :class:`DatablockItem` dirty so that + marks the owning category owner dirty so that ``_update_categories()`` knows work is needed. This exists because: @@ -175,9 +175,9 @@ def _set_value_from_minimizer(self, v: object) -> None: evaluations. """ self._value = v - parent_datablock = self._datablock_item() - if parent_datablock is not None: - parent_datablock._need_categories_update = True + parent_owner = self._category_owner() + if parent_owner is not None: + parent_owner._need_categories_update = True @property def description(self) -> str | None: @@ -355,7 +355,7 @@ def _set_value_user_constrained(self, v: object) -> None: """ Set the value from a constraint expression. - Bypasses validation and marks the parent datablock dirty, like + Bypasses validation and marks the parent category owner dirty, like ``_set_value_from_minimizer``, because constraints are applied inside the minimizer loop where trial values may exceed physical-range validators. Flags the parameter as user @@ -363,9 +363,9 @@ def _set_value_user_constrained(self, v: object) -> None: """ self._value = v self._user_constrained = True - parent_datablock = self._datablock_item() - if parent_datablock is not None: - parent_datablock._need_categories_update = True + parent_owner = self._category_owner() + if parent_owner is not None: + parent_owner._need_categories_update = True @property def free(self) -> bool: From 0647da0be398e375d394aaa9585fa19cccf0dfbf Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 10:43:44 +0200 Subject: [PATCH 08/15] Document CategoryOwner architecture changes --- docs/dev/Issues/issues_closed.md | 13 +++++++ docs/dev/Issues/issues_open.md | 19 +--------- docs/dev/architecture.md | 45 ++++++++++++++++-------- docs/dev/plan_category-owner-sections.md | 4 +-- 4 files changed, 46 insertions(+), 35 deletions(-) diff --git a/docs/dev/Issues/issues_closed.md b/docs/dev/Issues/issues_closed.md index fee4f327..a634932c 100644 --- a/docs/dev/Issues/issues_closed.md +++ b/docs/dev/Issues/issues_closed.md @@ -62,6 +62,19 @@ user-facing path and bypassed during fitting --- +## Introduce CategoryOwner for Analysis and Datablocks + +Added `CategoryOwner` as the shared base class for flat CIF-like +category owners. `DatablockItem` now extends `CategoryOwner`, keeping +real `data_` header behavior for structures and experiments. +`Analysis` also extends `CategoryOwner`, reusing shared category +discovery, parameter aggregation, and dirty tracking while remaining a +singleton section without a `data_` header. CIF serialization now +splits category-body rendering from datablock header rendering via +`category_owner_to_cif()`. + +--- + ## Move Calculator from Global to Per-Experiment Each experiment owns its calculator, auto-resolved on first access from diff --git a/docs/dev/Issues/issues_open.md b/docs/dev/Issues/issues_open.md index 3e961f1c..2e6142e8 100644 --- a/docs/dev/Issues/issues_open.md +++ b/docs/dev/Issues/issues_open.md @@ -27,23 +27,6 @@ match `project.experiments.names`. --- -## 5. 🟡 Make `Analysis` a `DatablockItem` - -**Type:** Consistency - -`Analysis` owns categories (`Aliases`, `Constraints`, -`JointFitCollection`) but does not extend `DatablockItem`. Its ad-hoc -`_update_categories()` iterates over a hard-coded list and does not -participate in standard category discovery, parameter enumeration, or -CIF serialisation. - -**Fix:** make `Analysis` extend `DatablockItem`, or extract a shared -`_update_categories()` protocol. - -**Depends on:** nothing. - ---- - ## 8. 🟡 Add Explicit `create()` Signatures on Collections **Type:** API safety @@ -100,7 +83,7 @@ with joint-fit workflows. at minimum document the required update order. For joint fitting, all experiments should be updateable in a single call. -**Depends on:** benefits from issue 5 (Analysis as DatablockItem). +**Depends on:** benefits from the CategoryOwner migration. --- diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md index d6a63b6d..d8b92e0f 100644 --- a/docs/dev/architecture.md +++ b/docs/dev/architecture.md @@ -53,7 +53,8 @@ GuardedBase # Controlled attribute access, parent lin ├── CollectionBase # Ordered name→item container │ ├── CategoryCollection # CIF loop (e.g. AtomSites, Background, Data) │ └── DatablockCollection # Top-level container (e.g. Structures, Experiments) -└── DatablockItem # CIF data block (e.g. Structure, Experiment) +└── CategoryOwner # Flat category owner (e.g. Analysis, DatablockItem) + └── DatablockItem # Real CIF data block (e.g. Structure, Experiment) ``` `CollectionBase` provides a unified dict-like API over an ordered item @@ -185,7 +186,19 @@ design constraint. Categories are never nested inside other categories **Update priority:** lower values run first. This ensures correct execution order within a datablock (e.g. background before data). -### 2.4 DatablockItem and DatablockCollection +### 2.4 CategoryOwner, DatablockItem, and DatablockCollection + +`CategoryOwner` is the shared base class for objects that own flat +category siblings. It provides category discovery, category sorting by +`_update_priority`, parameter aggregation, `_need_categories_update` +tracking, and category-body CIF serialization without a `data_` +header. + +`DatablockItem` extends `CategoryOwner` for real CIF `data_` +blocks. `Structure` and `ExperimentBase` subclasses are real +datablocks. `Analysis` is also a `CategoryOwner`, but it serializes as +a singleton section body in `analysis/analysis.cif` and does not emit a +fake `data_analysis` header. | Aspect | `DatablockItem` | `DatablockCollection` | | ------------------ | ------------------------------------------- | -------------------------------------------------------- | @@ -199,7 +212,7 @@ execution order within a datablock (e.g. background before data). | Dirty flag | `_need_categories_update` | N/A | When any `Parameter.value` is set, it propagates -`_need_categories_update = True` up to the owning `DatablockItem`. +`_need_categories_update = True` up to the owning `CategoryOwner`. Serialisation (`as_cif`) and plotting trigger `_update_categories()` if the flag is set. @@ -847,14 +860,16 @@ workflow: `Analysis` is bound to a `Project` and provides the high-level API: -- Fit configuration: `fit` (`CategoryItem` with `minimizer_type` and - `mode` descriptors). `fit.minimizer_type` selects the minimizer - backend. `fit.mode` stores whether fitting is `'single'`, `'joint'`, - or `'sequential'`. `fit.show_minimizer_types()` lists supported - minimizers; `fit.show_modes()` filters modes by experiment count (≤1 → - only `single`; >1 → all three). +- Singleton section: `Analysis` is a `CategoryOwner`, not a + `DatablockItem`. It owns sibling categories and serializes as the + body of `analysis/analysis.cif` without a `data_` header. +- Fit configuration: `fitting` (`CategoryItem` with + `minimizer_type`). `fitting.minimizer_type` selects the minimizer + backend. The active fitting mode lives on the owner as + `analysis.fitting_mode_type`, not as a nested child category field. + `fitting.show_minimizer_types()` lists supported minimizers. - Joint-fit weights: `joint_fit` (`CategoryCollection` of per-experiment - weight entries); sibling of `fit`, not a child. + weight entries); sibling of `fitting`, not a child. - Fit results: `analysis.fit_results` stores the latest runtime result object. This is `FitResults` for deterministic fits and `BayesianFitResults` for Bayesian DREAM runs. @@ -927,7 +942,7 @@ It owns and coordinates all components: ``` Parameter.value set → AttributeSpec validation (type + value) - → _need_categories_update = True (on parent DatablockItem) + → _need_categories_update = True (on parent CategoryOwner) Plot / CIF export / fit objective evaluation → _update_categories() @@ -1362,12 +1377,12 @@ if self._fitting_mode_type == 'joint': ### 9.7 Flat Category Structure — No Nested Categories Following CIF conventions, categories are **flat siblings** within their -owner (datablock or analysis object). A category must never be a child -of another category of a different type. Categories can reference each -other via IDs, but the ownership hierarchy is always: +owner (`CategoryOwner`). A category must never be a child of another +category of a different type. Categories can reference each other via +IDs, but the ownership hierarchy is always: ``` -Owner (DatablockItem / Analysis) +Owner (CategoryOwner) ├── CategoryA (CategoryItem or CategoryCollection) ├── CategoryB (CategoryItem or CategoryCollection) └── CategoryC (CategoryItem or CategoryCollection) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 4d921a58..5dc3c628 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -26,8 +26,8 @@ Phase 1 — Implementation [x] Phase 4: Move Analysis onto CategoryOwner. [x] Phase 5: Update dirty-flag lookup to CategoryOwner. [ ] Phase 6: (Optional) ProjectConfig cleanup. -[ ] Phase 7: Update architecture.md and Issues/issues_open.md. -[ ] Phase 1 review gate: present diff for approval. +[x] Phase 7: Update architecture.md and Issues/issues_open.md. +[x] Phase 1 review gate: present diff for approval. Phase 2 — Verification [ ] Add per-phase unit tests (see "Tests For Phase X" sections). From 2d0d31f24eba41e1725e24d08f6099a8ddec25ea Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 11:13:10 +0200 Subject: [PATCH 09/15] Move project info into ProjectConfig categories --- docs/dev/architecture.md | 26 ++- docs/dev/plan_category-owner-sections.md | 15 +- src/easydiffraction/io/cif/serialize.py | 13 +- .../project/categories/info/__init__.py | 8 + .../project/categories/info/default.py | 166 ++++++++++++++++++ .../project/categories/info/factory.py | 17 ++ .../project/categories/rendering/default.py | 8 +- src/easydiffraction/project/project.py | 28 +-- src/easydiffraction/project/project_config.py | 47 +++++ src/easydiffraction/project/project_info.py | 136 +------------- 10 files changed, 299 insertions(+), 165 deletions(-) create mode 100644 src/easydiffraction/project/categories/info/__init__.py create mode 100644 src/easydiffraction/project/categories/info/default.py create mode 100644 src/easydiffraction/project/categories/info/factory.py create mode 100644 src/easydiffraction/project/project_config.py diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md index d8b92e0f..7259c48c 100644 --- a/docs/dev/architecture.md +++ b/docs/dev/architecture.md @@ -937,6 +937,12 @@ It owns and coordinates all components: | `project.summary` | `Summary` | Report generation | | `project.verbosity` | `str` | Console output level (full/short/silent) | +Internally, `Project` keeps project-scoped singleton categories under a +private `ProjectConfig(CategoryOwner)` object. That owner currently holds +`project.info` (`ProjectInfo`) and `project.rendering` (`Rendering`), +while the public access paths stay flat on `Project` for user +discoverability. + ### 7.1 Data Flow ``` @@ -959,7 +965,7 @@ Projects are saved as a directory of CIF files: ```shell project_dir/ -├── project.cif # ProjectInfo + Display preferences +├── project.cif # ProjectConfig categories (info + rendering) ├── summary.cif # Summary report ├── structures/ │ └── lbco.cif # One file per structure @@ -969,14 +975,16 @@ project_dir/ └── analysis.cif # Analysis settings ``` -`project.cif` carries both the `_project.*` metadata and the -`_rendering.*` engine preferences (`chart_engine`, `table_engine`), so a -saved project re-opens with the same display backends. Per-experiment -calculator selection (`_calculation.calculator_type`) lives in each -experiment file, and fit configuration (`_fitting.minimizer_type`, -`_fitting.mode_type`) lives in `analysis/analysis.cif`. Runtime fit -outputs, including `analysis.fit_results`, posterior chains, posterior -predictive summaries, and convergence diagnostics, are not serialized. +`project.cif` serializes the private `ProjectConfig` owner without a +`data_` header. It carries both the `_project.*` metadata category and +the `_rendering.*` engine preferences (`chart_engine`, `table_engine`), +so a saved project re-opens with the same display backends. +Per-experiment calculator selection (`_calculation.calculator_type`) +lives in each experiment file, and fit configuration +(`_fitting.minimizer_type`, `_fitting.mode_type`) lives in +`analysis/analysis.cif`. Runtime fit outputs, including +`analysis.fit_results`, posterior chains, posterior predictive +summaries, and convergence diagnostics, are not serialized. ### 7.3 Verbosity diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 5dc3c628..3fdd03ef 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -25,7 +25,7 @@ Phase 1 — Implementation [x] Phase 3: Split CIF body serialization (category_owner_to_cif). [x] Phase 4: Move Analysis onto CategoryOwner. [x] Phase 5: Update dirty-flag lookup to CategoryOwner. -[ ] Phase 6: (Optional) ProjectConfig cleanup. +[x] Phase 6: (Optional) ProjectConfig cleanup. [x] Phase 7: Update architecture.md and Issues/issues_open.md. [x] Phase 1 review gate: present diff for approval. @@ -724,10 +724,15 @@ The cleaner long-term shape is: ```text Project `-- ProjectConfig(CategoryOwner) - |-- ProjectInfo or ProjectMetadata + |-- ProjectInfo `-- Rendering ``` +This full version is now implemented. `ProjectInfo` is a real +`CategoryItem` under `project/categories/info/`, `ProjectConfig` +serializes via shared `CategoryOwner` helpers, and the public +`project.info` / `project.rendering` access paths remain unchanged. + ### Low-Risk Version If converting `ProjectInfo` into a `CategoryItem` is too disruptive, do not do @@ -757,8 +762,10 @@ It should own descriptors for: - created timestamp - last modified timestamp -This is more work because timestamps and path handling need care. Do not mix -this with the `Analysis` migration. +This is more work because timestamps and path handling need care. The +implemented version keeps `path` as runtime-only state while storing the +project id, title, description, created timestamp, and last-modified +timestamp in the `ProjectInfo` category. ### Save/Load Rule diff --git a/src/easydiffraction/io/cif/serialize.py b/src/easydiffraction/io/cif/serialize.py index eb9639a5..89d970d5 100644 --- a/src/easydiffraction/io/cif/serialize.py +++ b/src/easydiffraction/io/cif/serialize.py @@ -352,8 +352,8 @@ def project_info_to_cif(info: object) -> str: else: description = '?' - created = f"'{info._created.strftime('%d %b %Y %H:%M:%S')}'" - last_modified = f"'{info._last_modified.strftime('%d %b %Y %H:%M:%S')}'" + created = f"'{info.created.strftime('%d %b %Y %H:%M:%S')}'" + last_modified = f"'{info.last_modified.strftime('%d %b %Y %H:%M:%S')}'" return ( f'_project.id {name}\n' @@ -372,6 +372,10 @@ def _as_cif_text(section: object) -> str: def project_config_to_cif(project: object) -> str: """Render project-level configuration to ``project.cif`` text.""" + config = getattr(project, '_config', None) + if config is not None: + return category_owner_to_cif(config) + lines: list[str] = [_as_cif_text(project.info)] rendering = getattr(project, 'rendering', None) if rendering is not None: @@ -459,6 +463,11 @@ def _populate_project_info_from_block( block: gemmi.cif.Block, ) -> None: """Populate ProjectInfo fields from a parsed CIF block.""" + from_cif = getattr(info, 'from_cif', None) + if callable(from_cif): + from_cif(block) + return + read_cif_string = _make_cif_string_reader(block) name = read_cif_string('_project.id') diff --git a/src/easydiffraction/project/categories/info/__init__.py b/src/easydiffraction/project/categories/info/__init__.py new file mode 100644 index 00000000..34ab82b9 --- /dev/null +++ b/src/easydiffraction/project/categories/info/__init__.py @@ -0,0 +1,8 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause +"""Project info category exports.""" + +from __future__ import annotations + +from easydiffraction.project.categories.info.default import ProjectInfo +from easydiffraction.project.categories.info.factory import ProjectInfoFactory \ No newline at end of file diff --git a/src/easydiffraction/project/categories/info/default.py b/src/easydiffraction/project/categories/info/default.py new file mode 100644 index 00000000..98be48e3 --- /dev/null +++ b/src/easydiffraction/project/categories/info/default.py @@ -0,0 +1,166 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause +"""Project info category.""" + +from __future__ import annotations + +import datetime +import pathlib + +from easydiffraction.core.category import CategoryItem +from easydiffraction.core.metadata import TypeInfo +from easydiffraction.core.validation import AttributeSpec +from easydiffraction.core.variable import StringDescriptor +from easydiffraction.io.cif.handler import CifHandler +from easydiffraction.io.cif.serialize import project_info_to_cif +from easydiffraction.project.categories.info.factory import ProjectInfoFactory +from easydiffraction.utils.logging import console +from easydiffraction.utils.utils import render_cif + +_PROJECT_TIMESTAMP_FORMAT = '%d %b %Y %H:%M:%S' + + +@ProjectInfoFactory.register +class ProjectInfo(CategoryItem): + """Project metadata category.""" + + type_info = TypeInfo( + tag='default', + description='Project metadata category', + ) + + def __init__( + self, + name: str = 'untitled_project', + title: str = 'Untitled Project', + description: str = '', + ) -> None: + super().__init__() + + created = datetime.datetime.now() + last_modified = datetime.datetime.now() + + self._project_id = StringDescriptor( + name='id', + description='Project identifier', + value_spec=AttributeSpec(default=name), + cif_handler=CifHandler(names=['_project.id']), + ) + self._title_descriptor = StringDescriptor( + name='title', + description='Project title', + value_spec=AttributeSpec(default=title), + cif_handler=CifHandler(names=['_project.title']), + ) + self._description_descriptor = StringDescriptor( + name='description', + description='Project description', + value_spec=AttributeSpec(default=' '.join(description.split())), + cif_handler=CifHandler(names=['_project.description']), + ) + self._created_descriptor = StringDescriptor( + name='created', + description='Project creation timestamp', + value_spec=AttributeSpec(default=created.strftime(_PROJECT_TIMESTAMP_FORMAT)), + cif_handler=CifHandler(names=['_project.created']), + ) + self._last_modified_descriptor = StringDescriptor( + name='last_modified', + description='Project last-modified timestamp', + value_spec=AttributeSpec(default=last_modified.strftime(_PROJECT_TIMESTAMP_FORMAT)), + cif_handler=CifHandler(names=['_project.last_modified']), + ) + self._path: pathlib.Path | None = None + + self._identity.category_code = 'project' + + @staticmethod + def _parse_timestamp(value: str) -> datetime.datetime: + """Parse project timestamp text from CIF storage format.""" + return datetime.datetime.strptime(value, _PROJECT_TIMESTAMP_FORMAT) + + @staticmethod + def _format_timestamp(value: datetime.datetime) -> str: + """Format a project timestamp for CIF storage.""" + return value.strftime(_PROJECT_TIMESTAMP_FORMAT) + + @property + def unique_name(self) -> str: + """Unique name for GuardedBase diagnostics.""" + return self.name + + @property + def name(self) -> str: + """Return the project name.""" + return self._project_id.value + + @name.setter + def name(self, value: str) -> None: + self._project_id.value = value + + @property + def title(self) -> str: + """Return the project title.""" + return self._title_descriptor.value + + @title.setter + def title(self, value: str) -> None: + self._title_descriptor.value = value + + @property + def description(self) -> str: + """Return sanitized description with single spaces.""" + return ' '.join(self._description_descriptor.value.split()) + + @description.setter + def description(self, value: str) -> None: + self._description_descriptor.value = ' '.join(value.split()) + + @property + def path(self) -> pathlib.Path | None: + """Return the project path as a Path object.""" + return self._path + + @path.setter + def path(self, value: object) -> None: + """Set the project directory path.""" + self._path = pathlib.Path(value) + + @property + def created(self) -> datetime.datetime: + """Return the creation timestamp.""" + return self._parse_timestamp(self._created_descriptor.value) + + def _set_created(self, value: datetime.datetime | str) -> None: + """Set the creation timestamp from runtime or CIF input.""" + if isinstance(value, datetime.datetime): + self._created_descriptor.value = self._format_timestamp(value) + return + self._created_descriptor.value = value + + @property + def last_modified(self) -> datetime.datetime: + """Return the last modified timestamp.""" + return self._parse_timestamp(self._last_modified_descriptor.value) + + def _set_last_modified(self, value: datetime.datetime | str) -> None: + """Set the last-modified timestamp from runtime or CIF input.""" + if isinstance(value, datetime.datetime): + self._last_modified_descriptor.value = self._format_timestamp(value) + return + self._last_modified_descriptor.value = value + + def update_last_modified(self) -> None: + """Update the last modified timestamp.""" + self._set_last_modified(datetime.datetime.now()) + + @property + def as_cif(self) -> str: + """Export project metadata to CIF.""" + return project_info_to_cif(self) + + def show_as_cif(self) -> None: + """Pretty-print CIF via shared utilities.""" + paragraph_title = f"Project 📦 '{self.name}' info as CIF" + console.paragraph(paragraph_title) + render_cif(self.as_cif) \ No newline at end of file diff --git a/src/easydiffraction/project/categories/info/factory.py b/src/easydiffraction/project/categories/info/factory.py new file mode 100644 index 00000000..2dc5ef08 --- /dev/null +++ b/src/easydiffraction/project/categories/info/factory.py @@ -0,0 +1,17 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause +"""Factory for project info categories.""" + +from __future__ import annotations + +from typing import ClassVar + +from easydiffraction.core.factory import FactoryBase + + +class ProjectInfoFactory(FactoryBase): + """Create project info category instances.""" + + _default_rules: ClassVar[dict] = { + frozenset(): 'default', + } \ No newline at end of file diff --git a/src/easydiffraction/project/categories/rendering/default.py b/src/easydiffraction/project/categories/rendering/default.py index f8970cb5..cdca6f75 100644 --- a/src/easydiffraction/project/categories/rendering/default.py +++ b/src/easydiffraction/project/categories/rendering/default.py @@ -83,9 +83,11 @@ def table_engine(self, value: str) -> None: @property def plotter(self) -> Plotter: """Live plotting facade bound to the owning project.""" - parent = getattr(self, '_parent', None) - if parent is not None: - self._plotter._set_project(parent) + owner = getattr(self, '_parent', None) + while owner is not None and not hasattr(owner, 'structures'): + owner = getattr(owner, '_parent', None) + if owner is not None: + self._plotter._set_project(owner) return self._plotter @property diff --git a/src/easydiffraction/project/project.py b/src/easydiffraction/project/project.py index df8972de..43113199 100644 --- a/src/easydiffraction/project/project.py +++ b/src/easydiffraction/project/project.py @@ -18,9 +18,9 @@ from easydiffraction.io.cif.serialize import project_config_to_cif from easydiffraction.io.cif.serialize import project_to_cif from easydiffraction.project.categories.rendering import Rendering -from easydiffraction.project.categories.rendering import RenderingFactory from easydiffraction.project.display import ProjectDisplay from easydiffraction.project.project_info import ProjectInfo +from easydiffraction.project.project_config import ProjectConfig from easydiffraction.summary.summary import Summary from easydiffraction.utils.enums import VerbosityEnum from easydiffraction.utils.logging import console @@ -127,11 +127,11 @@ def __init__( ) -> None: super().__init__() - self._info: ProjectInfo = ProjectInfo(name, title, description) + self._config = ProjectConfig(name, title, description) + object.__setattr__(self, '_info', self._config.info) self._structures = Structures() self._experiments = Experiments() - self._rendering = RenderingFactory.create('default') - self._rendering._parent = self + object.__setattr__(self, '_rendering', self._config.rendering) self._display = ProjectDisplay(self) self._analysis = Analysis(self) self._summary = Summary(self) @@ -323,7 +323,7 @@ def load(cls, dir_path: str) -> Project: cif_text = project_cif_path.read_text() project_config_from_cif(project, cif_text) - project._info.path = project_path + project.info.path = project_path # 2. Load structures structures_dir = project_path / 'structures' @@ -390,7 +390,7 @@ def _resolve_alias_references(self) -> None: def save(self) -> None: """Save the project into the existing project directory.""" - if self._info.path is None: + if self.info.path is None: log.error('Project path not specified. Use save_as() to define the path first.') return @@ -403,15 +403,15 @@ def save(self) -> None: self._analysis._update_categories() # Ensure project directory exists - self._info.path.mkdir(parents=True, exist_ok=True) + self.info.path.mkdir(parents=True, exist_ok=True) # Save project-level configuration - with (self._info.path / 'project.cif').open('w') as f: + with (self.info.path / 'project.cif').open('w') as f: f.write(project_config_to_cif(self)) console.print('├── 📄 project.cif') # Save structures - sm_dir = self._info.path / 'structures' + sm_dir = self.info.path / 'structures' sm_dir.mkdir(parents=True, exist_ok=True) console.print('├── 📁 structures/') for structure in self.structures.values(): @@ -422,7 +422,7 @@ def save(self) -> None: console.print(f'│ └── 📄 {file_name}') # Save experiments - expt_dir = self._info.path / 'experiments' + expt_dir = self.info.path / 'experiments' expt_dir.mkdir(parents=True, exist_ok=True) console.print('├── 📁 experiments/') for experiment in self.experiments.values(): @@ -433,7 +433,7 @@ def save(self) -> None: console.print(f'│ └── 📄 {file_name}') # Save analysis - analysis_dir = self._info.path / 'analysis' + analysis_dir = self.info.path / 'analysis' analysis_dir.mkdir(parents=True, exist_ok=True) with (analysis_dir / 'analysis.cif').open('w') as f: f.write(self.analysis.as_cif) @@ -447,11 +447,11 @@ def save(self) -> None: console.print(f'│ {branch} 📄 {file_name}') # Save summary - with (self._info.path / 'summary.cif').open('w') as f: + with (self.info.path / 'summary.cif').open('w') as f: f.write(self.summary.as_cif()) console.print('└── 📄 summary.cif') - self._info.update_last_modified() + self.info.update_last_modified() self._saved = True def save_as( @@ -464,7 +464,7 @@ def save_as( if temporary: tmp: str = tempfile.gettempdir() dir_path = pathlib.Path(tmp) / dir_path - self._info.path = dir_path + self.info.path = dir_path self.save() def apply_params_from_csv(self, row_index: int) -> None: diff --git a/src/easydiffraction/project/project_config.py b/src/easydiffraction/project/project_config.py new file mode 100644 index 00000000..f0f7437b --- /dev/null +++ b/src/easydiffraction/project/project_config.py @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause +"""Project configuration owner for singleton project categories.""" + +from __future__ import annotations + +from easydiffraction.core.category_owner import CategoryOwner +from easydiffraction.project.categories.info import ProjectInfo +from easydiffraction.project.categories.info import ProjectInfoFactory +from easydiffraction.project.categories.rendering import Rendering +from easydiffraction.project.categories.rendering import RenderingFactory + + +class ProjectConfig(CategoryOwner): + """Own singleton project configuration categories.""" + + def __init__( + self, + name: str = 'untitled_project', + title: str = 'Untitled Project', + description: str = '', + ) -> None: + super().__init__() + self._info = ProjectInfoFactory.create( + ProjectInfoFactory.default_tag(), + name=name, + title=title, + description=description, + ) + self._rendering = RenderingFactory.create(RenderingFactory.default_tag()) + + @property + def info(self) -> ProjectInfo: + """Project metadata category.""" + return self._info + + @property + def rendering(self) -> Rendering: + """Rendering configuration category.""" + return self._rendering + + @property + def as_cif(self) -> str: + """Serialize singleton project categories to CIF.""" + from easydiffraction.io.cif.serialize import category_owner_to_cif # noqa: PLC0415 + + return category_owner_to_cif(self) \ No newline at end of file diff --git a/src/easydiffraction/project/project_info.py b/src/easydiffraction/project/project_info.py index 9bdfac7b..b0745b88 100644 --- a/src/easydiffraction/project/project_info.py +++ b/src/easydiffraction/project/project_info.py @@ -1,137 +1,7 @@ -# SPDX-FileCopyrightText: 2025 EasyScience contributors +# SPDX-FileCopyrightText: 2026 EasyScience contributors # SPDX-License-Identifier: BSD-3-Clause -"""Project metadata container used by Project.""" +"""Project metadata category export.""" from __future__ import annotations -import datetime -import pathlib - -from easydiffraction.core.guard import GuardedBase -from easydiffraction.io.cif.serialize import project_info_to_cif -from easydiffraction.utils.logging import console -from easydiffraction.utils.utils import render_cif - - -class ProjectInfo(GuardedBase): - """Store project metadata: name, title, description, paths.""" - - def __init__( - self, - name: str = 'untitled_project', - title: str = 'Untitled Project', - description: str = '', - ) -> None: - super().__init__() - - self._name = name - self._title = title - self._description = description - self._path: pathlib.Path | None = None # pathlib.Path.cwd() - self._created: datetime.datetime = datetime.datetime.now() - self._last_modified: datetime.datetime = datetime.datetime.now() - - @property - def name(self) -> str: - """Return the project name.""" - return self._name - - @name.setter - def name(self, value: str) -> None: - """ - Set the project name. - - Parameters - ---------- - value : str - New project name. - """ - self._name = value - - @property - def unique_name(self) -> str: - """Unique name for GuardedBase diagnostics.""" - return self.name - - @property - def title(self) -> str: - """Return the project title.""" - return self._title - - @title.setter - def title(self, value: str) -> None: - """ - Set the project title. - - Parameters - ---------- - value : str - New project title. - """ - self._title = value - - @property - def description(self) -> str: - """Return sanitized description with single spaces.""" - return ' '.join(self._description.split()) - - @description.setter - def description(self, value: str) -> None: - """ - Set the project description (whitespace normalized). - - Parameters - ---------- - value : str - New description text. - """ - self._description = ' '.join(value.split()) - - @property - def path(self) -> pathlib.Path | None: - """Return the project path as a Path object.""" - return self._path - - @path.setter - def path(self, value: object) -> None: - """ - Set the project directory path. - - Parameters - ---------- - value : object - New path as a :class:`str` or :class:`pathlib.Path`. - """ - # Accept str or Path; normalize to Path - self._path = pathlib.Path(value) - - @property - def created(self) -> datetime.datetime: - """Return the creation timestamp.""" - return self._created - - @property - def last_modified(self) -> datetime.datetime: - """Return the last modified timestamp.""" - return self._last_modified - - def update_last_modified(self) -> None: - """Update the last modified timestamp.""" - self._last_modified = datetime.datetime.now() - - def parameters(self) -> None: - """List parameters (not implemented).""" - - # TODO: Consider moving to io.cif.serialize - @property - def as_cif(self) -> str: - """Export project metadata to CIF.""" - return project_info_to_cif(self) - - # TODO: Consider moving to io.cif.serialize - def show_as_cif(self) -> None: - """Pretty-print CIF via shared utilities.""" - paragraph_title: str = f"Project 📦 '{self.name}' info as CIF" - cif_text: str = self.as_cif - console.paragraph(paragraph_title) - render_cif(cif_text) +from easydiffraction.project.categories.info.default import ProjectInfo From c225f75d6fbfd6f4863e5d1cc81a7a268305f7ff Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 13:29:28 +0200 Subject: [PATCH 10/15] Plan Project-to-Workspace rename --- .../adr_workspace-root-project-category.md | 327 ++++++ .../plan_workspace-root-project-category.md | 1042 +++++++++++++++++ 2 files changed, 1369 insertions(+) create mode 100644 docs/dev/ADR-suggestions/adr_workspace-root-project-category.md create mode 100644 docs/dev/plan_workspace-root-project-category.md diff --git a/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md b/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md new file mode 100644 index 00000000..88083794 --- /dev/null +++ b/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md @@ -0,0 +1,327 @@ +# ADR: Workspace Root and Project Information Category + +**Status:** Proposed +**Date:** 2026-05-17 + +## Context + +The current public root object is `Project`. It acts as the top-level +facade for an EasyDiffraction working session: + +```python +project = ed.Project(name='lbco_hrpt') +project.structures +project.experiments +project.analysis +project.display +project.summary +project.save() +``` + +The same word, "project", is also the natural CIF category name for +information about the scientific project: + +```cif +_project.id +_project.title +_project.description +_project.created +_project.last_modified +``` + +This creates a naming conflict. The root object is a broad runtime +facade, while the `_project.*` category is only information about the +scientific project. Using the same name for both makes category naming +awkward: + +```python +project.project_info.title +project.config.project_info.title +project.project.title +``` + +At the same time, replacing `_project.*` with a generic `_meta.*` +category would weaken the CIF model: + +```cif +_meta.project_id +_meta.project_title +``` + +The category name `meta` is too generic. It forces each item name to +repeat what the category should already communicate. The existing +`_project.id` and `_project.title` names are more semantic and better +aligned with the repository rule to follow CIF naming unless there is a +clearly better API. + +The design question is therefore: + +- should the top-level runtime object remain `Project`, and project + information move to a different category such as `meta`; +- or should the top-level runtime object be renamed to `Workspace`, so + `project` can be used cleanly for project information? + +## Decision + +Rename the top-level runtime facade from `Project` to `Workspace`. + +Use `project` as the public project-information category under the +workspace: + +```python +workspace = ed.Workspace(project_id='lbco_hrpt') +workspace.project.id +workspace.project.title = 'La0.5Ba0.5CoO3 at HRPT@PSI' +workspace.rendering.table_engine = 'rich' +workspace.structures +workspace.experiments +workspace.analysis +``` + +Persist workspace-level singleton categories in `workspace.cif`: + +```cif +_project.id +_project.title +_project.description +_project.created +_project.last_modified + +_rendering.chart_engine +_rendering.table_engine +``` + +Do not introduce `_meta.*` CIF tags. + +The intended naming split is: + +```text +Workspace +|-- project # information about the scientific project +|-- rendering # rendering preferences +|-- structures # real structure datablocks +|-- experiments # real experiment datablocks +|-- analysis # analysis section +|-- display # display facade +`-- summary # summary/report facade +``` + +## Rationale + +### `Workspace` better describes the top-level facade + +The top-level object is more than project metadata. It owns active +collections, analysis state, display helpers, save/load behavior, and +runtime orchestration. `Workspace` describes that broader role without +consuming the domain word `project`. + +The name is also familiar in scientific software. It commonly means an +active analysis environment, a data/model container, or a working area. +That is close to the role of the current EasyDiffraction root object. + +### `project` is the right category name for project information + +Project information is not generic metadata. It is specifically the +identity, title, description, and timestamps of the scientific project. + +This reads cleanly: + +```python +workspace.project.title +``` + +and maps directly to clean CIF: + +```cif +_project.title +``` + +### `_meta.project_title` is weaker than `_project.title` + +The `_meta` category would make the CIF less domain-oriented. It also +creates longer and more repetitive item names: + +```cif +_meta.project_id +_meta.project_title +_meta.project_description +``` + +The existing `_project.*` tags are clearer: + +```cif +_project.id +_project.title +_project.description +``` + +### This keeps layer-specific consistency + +After this decision, each layer has a clear rule: + +| Layer | Rule | Example | +| --- | --- | --- | +| Runtime root | working-session facade | `Workspace` | +| Public category | semantic category name | `workspace.project` | +| CIF category | semantic CIF category | `_project.*` | +| Config file | workspace singleton categories | `workspace.cif` | + +This avoids one-off aliases such as `project.info` while preserving +semantic CIF names. + +## Consequences + +### Positive + +- The root object and project-information category no longer share the + same conceptual name. +- Public category access becomes uniform: + `workspace.project`, `workspace.rendering`, `workspace.analysis`. +- CIF stays semantic and does not introduce `_meta.*`. +- Project information can use short item names such as `id`, `title`, + and `description`. +- The top-level facade name better reflects active runtime orchestration. + +### Negative + +- This is a breaking public API change. +- Tutorials, scripts, tests, docs, type hints, and imports must be + updated from `Project` to `Workspace`. +- Existing saved directories using `project.cif` must be migrated to + `workspace.cif` if no compatibility loader is kept. +- Users familiar with `Project` must learn the new root name. +- `Workspace` can be confused with a filesystem workspace in some + ecosystems, so documentation must define it clearly as the active + EasyDiffraction working object. + +## Compatibility Policy + +EasyDiffraction is in beta, and repository instructions say not to keep +legacy shims by default. + +Therefore the target implementation should not add a permanent +`Project = Workspace` alias unless explicitly approved before +implementation. + +The migration plan still includes a review gate before removing the old +`Project` public symbol, because this is a user-facing breaking change. + +## Alternatives Considered + +### Keep `Project` root and rename project information to `meta` + +Rejected. + +Example: + +```python +project.meta.project_title +``` + +```cif +_meta.project_title +``` + +This keeps the root class stable, but it weakens the category model. +`meta` is too broad, and the CIF item names become repetitive. + +### Keep `Project` root and use `project.config.project` + +Rejected for now. + +Example: + +```python +project.config.project.title +``` + +This is technically consistent, but it still repeats `project` at +different semantic layers. It also adds depth to common user workflows. +It is a reasonable fallback if the public root rename is rejected. + +### Keep `Project` root and use `project.info` + +Rejected for the target design. + +Example: + +```python +project.info.title +``` + +This is readable, but it preserves a special-case category alias. The +current goal is stronger consistency between public categories and CIF +category concepts. + +### Rename only internal files and keep public API unchanged + +Rejected for the target design. + +This improves implementation clarity but does not solve the public +naming inconsistency. + +### Use `Study` instead of `Workspace` + +Rejected. + +`Study` is a plausible scientific term, but it is less established for +an active computational container. It also does not map as naturally to +save/load, display, and analysis orchestration. + +## Implementation Notes + +The implementation should follow: + +```text +docs/dev/plan_workspace-root-project-category.md +``` + +The high-level migration is: + +1. Rename the root facade `Project` to `Workspace`. +2. Rename the project package/module surface from `project` to + `workspace`. +3. Rename `ProjectConfig` to `WorkspaceConfig`. +4. Rename project-level category access from `info` to `project`. +5. Rename project-information `name` access to `id`, matching + `_project.id`. +6. Move the storage path to `Workspace.path`, because it describes the + saved workspace directory rather than project information. +7. Keep the information category class named `ProjectInfo` unless a + later decision chooses `ProjectMetadata`. +8. Keep CIF tags `_project.*` and `_rendering.*`. +9. Rename saved singleton config file from `project.cif` to + `workspace.cif`. +10. Update code, tests, scripts, tutorials, docs, and architecture + references. + +## Post-Implementation ADR Update + +This ADR must be updated after the migration plan is implemented. + +When implementation is complete: + +1. Change status from `Proposed` to `Accepted and implemented`. +2. Record the final public API and saved file layout. +3. Record whether a temporary or permanent `Project` compatibility alias + was approved. +4. Record any deviations from the migration plan. +5. Move this file from `docs/dev/ADR-suggestions/` to `docs/dev/ADRs/` + if that is the repository convention for accepted decisions. +6. Update `docs/dev/architecture.md`. +7. Update or close related items in `docs/dev/Issues/issues_open.md`. + +## Acceptance Criteria + +This ADR is satisfied when: + +- `ed.Workspace` is the public top-level facade. +- `ed.Project` is removed unless explicitly approved as an alias. +- the public project-information category is `workspace.project`. +- project identity is exposed as `workspace.project.id`. +- the saved directory path is exposed as `workspace.path`. +- the public rendering category is `workspace.rendering`. +- saved singleton configuration lives in `workspace.cif`. +- `workspace.cif` uses `_project.*` and `_rendering.*` tags. +- no `_meta.*` tags are introduced for project information. +- tutorials and architecture documentation use `Workspace`. diff --git a/docs/dev/plan_workspace-root-project-category.md b/docs/dev/plan_workspace-root-project-category.md new file mode 100644 index 00000000..3cfbbfa1 --- /dev/null +++ b/docs/dev/plan_workspace-root-project-category.md @@ -0,0 +1,1042 @@ +# Workspace Root and Project Category Migration Plan + +## Status + +Branch: `feature/workspace-root-project-category` + +ADR suggestion: + +```text +docs/dev/ADR-suggestions/adr_workspace-root-project-category.md +``` + +Two-phase workflow from `.github/copilot-instructions.md`: + +- Phase 1 - Implementation. Code, docs, and architecture updates only. + Do not create or run tests unless the user explicitly asks. +- Phase 2 - Verification. Add/update tests, then run the verification + commands listed near the end of this plan. + +Stop after Phase 1 and request review before starting Phase 2. + +Status checklist. Mark `[x]` only while implementing: + +```text +Phase 1 - Implementation +[ ] Phase 0: Confirm breaking-change approval. +[ ] Phase 1: Rename root package and public facade to Workspace. +[ ] Phase 2: Rename project-info access from info to project. +[ ] Phase 3: Align project information fields with _project.* tags. +[ ] Phase 4: Rename project config file to workspace.cif. +[ ] Phase 5: Update root-object references across runtime code. +[ ] Phase 6: Update docs, tutorials, and ADR references. +[ ] Phase 7: Remove old public Project surface unless approved. +[ ] Phase 1 review gate: present diff for approval. + +Phase 2 - Verification +[ ] Move/update unit tests to workspace paths. +[ ] Add workspace naming and CIF layout tests. +[ ] pixi run test-structure-check +[ ] pixi run fix +[ ] pixi run check +[ ] pixi run unit-tests +[ ] pixi run integration-tests +[ ] pixi run script-tests +[ ] pixi run notebook-prepare +[ ] pixi run notebook-tests +``` + +## Commit Discipline + +When an AI agent follows this plan, every completed Phase 1 +implementation step must be staged with explicit paths and committed +locally before moving to the next implementation step or to the Phase 1 +review gate. + +Follow the **Commits** section of `.github/copilot-instructions.md`. + +Rules: + +- One commit per phase. +- Keep each commit atomic and single-purpose. +- Stage explicit paths only. Do not use `git add .`. +- Use `git mv` for file and directory moves. +- Do not stage unrelated user changes. +- Do not stage generated artifacts unless the user explicitly asks. +- If a serious uncovered design issue appears, stop and ask before + continuing. + +Suggested commit messages: + +```text +Rename Project facade to Workspace +Expose project information as workspace.project +Align project metadata fields with CIF names +Rename project config file to workspace.cif +Update runtime references to Workspace +Update docs for Workspace root API +Remove old Project public API surface +``` + +## Goal + +Split the name "project" into two distinct concepts: + +1. `Workspace` - the top-level runtime facade, currently named + `Project`. +2. `workspace.project` - the category that stores information about the + scientific project. + +Target public API: + +```python +import easydiffraction as ed + +workspace = ed.Workspace(project_id='lbco_hrpt') +workspace.project.id +workspace.project.title = 'La0.5Ba0.5CoO3 at HRPT@PSI' +workspace.rendering.table_engine = 'rich' +workspace.structures +workspace.experiments +workspace.analysis +workspace.display +workspace.summary +workspace.save_as('lbco_hrpt') +``` + +Target workspace-level config file: + +```text +workspace.cif +``` + +Target CIF tags inside `workspace.cif`: + +```cif +_project.id +_project.title +_project.description +_project.created +_project.last_modified + +_rendering.chart_engine +_rendering.table_engine +``` + +Do not introduce `_meta.*` tags. + +## Decisions Already Made + +Use these decisions unless the user explicitly changes the ADR before +implementation: + +- The public root class becomes `Workspace`. +- The public root import becomes `from easydiffraction import Workspace`. +- The public project-information category becomes `workspace.project`. +- The public rendering category remains `workspace.rendering`. +- The project-information category keeps semantic CIF tags `_project.*`. +- The rendering category keeps semantic CIF tags `_rendering.*`. +- The saved singleton config file becomes `workspace.cif`. +- The storage directory path belongs to `Workspace.path`, not + `workspace.project.path`. +- The old `Project` public API is removed unless the user explicitly + approves an alias before implementation. + +## Current Shape + +The current implementation already has a category-owner based project +configuration layer: + +```text +src/easydiffraction/project/ +|-- project.py # class Project +|-- project_config.py # class ProjectConfig +|-- project_info.py # ProjectInfo export +|-- display.py # class ProjectDisplay +`-- categories/ + |-- info/ # ProjectInfo category + `-- rendering/ # Rendering category +``` + +Current public API: + +```python +project = ed.Project(name='my_project') +project.info.title +project.rendering.table_engine +``` + +Current saved config file: + +```text +project.cif +``` + +## Target Shape + +Target implementation: + +```text +src/easydiffraction/workspace/ +|-- workspace.py # class Workspace +|-- workspace_config.py # class WorkspaceConfig +|-- project_info.py # ProjectInfo export +|-- display.py # class WorkspaceDisplay +`-- categories/ + |-- project/ # ProjectInfo category + `-- rendering/ # Rendering category +``` + +Target public API: + +```python +workspace = ed.Workspace(project_id='my_project') +workspace.project.title +workspace.rendering.table_engine +``` + +## Out Of Scope + +Do not do these in this migration: + +- Do not add `_meta.*` CIF tags. +- Do not redesign structure or experiment datablocks. +- Do not change analysis fit-mode semantics. +- Do not change calculator behavior. +- Do not edit generated package-structure docs by hand. +- Do not edit generated notebooks directly. Edit tutorial `.py` sources + and regenerate notebooks during Phase 2. +- Do not keep a `Project = Workspace` compatibility alias unless the + user explicitly approves it. + +## Phase 0: Confirm Breaking-Change Approval + +This migration removes or replaces the public `Project` API unless the +user approves a compatibility alias. + +Before changing code, ask the user to confirm: + +```text +This migration removes ed.Project and replaces it with ed.Workspace. +Should implementation proceed without a Project compatibility alias? +``` + +If the user asks for a compatibility alias, record that decision in this +plan and in the ADR before implementation. + +Do not implement code before this approval gate. + +Commit: no commit required for this phase unless the plan or ADR is +updated. + +## Phase 1: Rename Root Package And Public Facade + +### Objective + +Rename the top-level runtime facade and package from `project` to +`workspace`. + +### Files Likely To Change + +- `src/easydiffraction/project/` +- `src/easydiffraction/__init__.py` +- `src/easydiffraction/__main__.py` +- `src/easydiffraction/analysis/analysis.py` +- `src/easydiffraction/analysis/sequential.py` +- `src/easydiffraction/display/plotting.py` +- `src/easydiffraction/summary/summary.py` +- any source file importing `easydiffraction.project.*` + +### Steps + +1. Move the source package: + + ```text + src/easydiffraction/project/ + -> src/easydiffraction/workspace/ + ``` + +2. Rename files: + + ```text + workspace/project.py + -> workspace/workspace.py + + workspace/project_config.py + -> workspace/workspace_config.py + ``` + +3. Rename classes: + + ```text + Project -> Workspace + ProjectConfig -> WorkspaceConfig + ProjectDisplay -> WorkspaceDisplay + ``` + +4. Update top-level import: + + ```python + from easydiffraction.workspace.workspace import Workspace + ``` + +5. Remove the old top-level `Project` import unless the user approved an + alias. + +6. Update type-checking imports: + + ```python + from easydiffraction.workspace.workspace import Workspace + ``` + +7. Update docstrings from "Project facade" to "Workspace facade" where + they describe the root object. + +8. Run a source-only grep. Do not run tests in Phase 1: + + ```shell + rg -n "easydiffraction\\.project|\\bProject\\b|ProjectDisplay|ProjectConfig" src + ``` + + For every match, decide whether it refers to: + + - the old root object, which should become `Workspace`; + - the project-information category, which should remain project; + - historical text that should be updated in docs later. + +### Stop Conditions + +Stop and ask if: + +- another public class named `Workspace` already exists; +- package moves break imports in a way that would require compatibility + shims; +- a file has both root-object `project` and category `project` meanings + that cannot be separated clearly. + +### Commit + +Stage explicit moved and edited files, then commit: + +```text +Rename Project facade to Workspace +``` + +## Phase 2: Rename Project-Info Access From `info` To `project` + +### Objective + +Make the project-information category public as `workspace.project` +instead of `workspace.info`. + +### Files Likely To Change + +- `src/easydiffraction/workspace/workspace_config.py` +- `src/easydiffraction/workspace/workspace.py` +- `src/easydiffraction/workspace/categories/info/` +- `src/easydiffraction/workspace/project_info.py` +- `src/easydiffraction/io/cif/serialize.py` +- all code using `.info` for project information + +### Steps + +1. Rename category package: + + ```text + src/easydiffraction/workspace/categories/info/ + -> src/easydiffraction/workspace/categories/project/ + ``` + +2. Keep the category class name `ProjectInfo` for now. The class name is + explicit and avoids a confusing `Project` class after the root class + is renamed to `Workspace`. + +3. Rename imports: + + ```python + from easydiffraction.workspace.categories.project import ProjectInfo + from easydiffraction.workspace.categories.project import ProjectInfoFactory + ``` + +4. In `WorkspaceConfig`, rename: + + ```text + _info -> _project + info -> project + ``` + +5. In `Workspace`, rename: + + ```text + _info -> _project + info -> project + ``` + +6. Remove the public `.info` property unless the user approved a + compatibility alias. + +7. Update all runtime references: + + ```text + workspace.info.title -> workspace.project.title + workspace.info.description -> workspace.project.description + workspace.info.update_last_modified() -> workspace.project.update_last_modified() + ``` + +8. Run grep: + + ```shell + rg -n "\\.info\\b|categories/info|categories\\.info" src + ``` + +9. For every match, update it if it refers to project information. + Leave unrelated uses of the word "info" alone. + +### Stop Conditions + +Stop and ask if: + +- `info` appears as a different public concept unrelated to project + information; +- removing `.info` would break a user-requested compatibility alias. + +### Commit + +```text +Expose project information as workspace.project +``` + +## Phase 3: Align Project Information Fields With `_project.*` + +### Objective + +Expose project identity as `workspace.project.id`, matching CIF +`_project.id`. + +Move the saved directory path to `workspace.path`, because it describes +the workspace location and is not serialized project information. + +### Files Likely To Change + +- `src/easydiffraction/workspace/categories/project/default.py` +- `src/easydiffraction/workspace/workspace.py` +- `src/easydiffraction/io/cif/serialize.py` +- `src/easydiffraction/summary/summary.py` +- `src/easydiffraction/display/plotting.py` +- any code using `.name` for project identity or `.project.path` + +### Steps + +1. In `ProjectInfo`, rename the public identity property: + + ```text + name -> id + ``` + +2. Keep the underlying CIF tag unchanged: + + ```python + CifHandler(names=['_project.id']) + ``` + +3. Update `ProjectInfo.unique_name` to return `self.id`. + +4. Update `project_info_to_cif()` and CIF loading helpers to use + `info.id`. + +5. Rename constructor arguments: + + ```text + name -> project_id + ``` + + Apply this to: + + - `Workspace.__init__` + - `WorkspaceConfig.__init__` + - `ProjectInfo.__init__` + - `ProjectInfoFactory.create(...)` call sites + +6. Add `Workspace.path` as the runtime storage path. + + Suggested shape: + + ```python + @property + def path(self) -> pathlib.Path | None: + """Saved workspace directory.""" + return self._path + + @path.setter + def path(self, value: object) -> None: + self._path = pathlib.Path(value) + ``` + +7. Remove `ProjectInfo.path` unless explicitly approved as a + compatibility alias. + +8. Update save/load logic: + + ```text + workspace.path + ``` + + should replace: + + ```text + workspace.project.path + ``` + +9. Update messages and string representations: + + ```text + Workspace '' (...) + Saving workspace '' to ... + ``` + +10. Run grep: + + ```shell + rg -n "\\.name\\b|\\.path\\b|project_id|Project identifier" src/easydiffraction/workspace src/easydiffraction/io src/easydiffraction/display src/easydiffraction/summary + ``` + + Inspect each match manually. Do not blindly replace every `.name`; + structures and experiments still use `.name`. + +### Stop Conditions + +Stop and ask if: + +- a caller depends on `workspace.name` as a root-object property; +- moving `path` out of `ProjectInfo` makes save/load unclear; +- external saved fixtures require an approved compatibility path. + +### Commit + +```text +Align project metadata fields with CIF names +``` + +## Phase 4: Rename Project Config File To `workspace.cif` + +### Objective + +Rename the saved singleton configuration file from `project.cif` to +`workspace.cif`. + +### Files Likely To Change + +- `src/easydiffraction/workspace/workspace.py` +- `src/easydiffraction/io/cif/serialize.py` +- CLI entry points in `src/easydiffraction/__main__.py` +- docs that describe saved project directories +- test fixtures in Phase 2 + +### Steps + +1. Rename serializer functions if they still use project-root naming: + + ```text + project_config_to_cif -> workspace_config_to_cif + project_config_from_cif -> workspace_config_from_cif + project_to_cif -> workspace_to_cif + ``` + + Do not rename `project_info_to_cif`; it serializes the `_project` + category and that name remains correct. + +2. Update `Workspace.save()` to write: + + ```text + workspace.cif + ``` + +3. Update `Workspace.load()` to read: + + ```text + workspace.cif + ``` + +4. Do not add `project.cif` fallback unless the user approved a + compatibility loader. + +5. Keep the contents semantic: + + ```cif + _project.id + _project.title + _rendering.table_engine + ``` + +6. Update logging and console output from `project.cif` to + `workspace.cif`. + +7. Run grep: + + ```shell + rg -n "project\\.cif|project_config_to_cif|project_config_from_cif|project_to_cif" src docs tests + ``` + + In Phase 1, update source and docs only. Test files are handled in + Phase 2 unless the user explicitly asks otherwise. + +### Stop Conditions + +Stop and ask if: + +- repository fixtures or tutorials contain saved directories that must + remain loadable without conversion; +- the user wants a one-release compatibility loader. + +### Commit + +```text +Rename project config file to workspace.cif +``` + +## Phase 5: Update Root-Object References Across Runtime Code + +### Objective + +Replace root-object variables and attributes named `project` with +`workspace` where they refer to the top-level facade. + +Keep the word `project` where it refers to the project-information +category or the scientific project itself. + +### Files Likely To Change + +- `src/easydiffraction/analysis/analysis.py` +- `src/easydiffraction/analysis/sequential.py` +- `src/easydiffraction/display/plotting.py` +- `src/easydiffraction/project/display.py` after it has moved to + `workspace/display.py` +- `src/easydiffraction/summary/summary.py` +- `src/easydiffraction/__main__.py` +- `src/easydiffraction/io/*` + +### Steps + +1. Rename root references in `Analysis`: + + ```text + self.project -> self.workspace + analysis.project -> analysis.workspace + ``` + +2. Rename display internals: + + ```text + self._project -> self._workspace + _set_project(...) -> _set_workspace(...) + ``` + + Only do this when the object is the top-level runtime facade. + +3. Rename local variables in runtime code: + + ```text + project = Workspace(...) + -> workspace = Workspace(...) + ``` + +4. Keep scientific-project wording where appropriate: + + ```text + workspace.project.title + project_id + _project.id + ``` + +5. Update user-facing messages carefully. Good examples: + + ```text + "Workspace directory not found" + "Saving workspace" + "Project title" + ``` + +6. Run grep: + + ```shell + rg -n "\\bproject\\b|\\bProject\\b|_project|ProjectDisplay" src/easydiffraction + ``` + +7. Inspect each match. Do not replace `_project` CIF tags. + +### Stop Conditions + +Stop and ask if: + +- a name has both root-workspace and project-information meanings in the + same function and cannot be made clear; +- renaming a method such as `_set_project` would require updating public + plugin or user code. + +### Commit + +```text +Update runtime references to Workspace +``` + +## Phase 6: Update Docs, Tutorials, And ADR References + +### Objective + +Update user-facing and developer-facing documentation to describe the new +root object and project-information category. + +### Files Likely To Change + +- `docs/dev/architecture.md` +- `docs/dev/Issues/issues_open.md` +- `docs/dev/ADRs/*.md` +- `docs/dev/ADR-suggestions/*.md` +- `docs/docs/tutorials/*.py` +- `README.md` +- `CONTRIBUTING.md` only if it contains API examples + +Do not edit these by hand: + +- `docs/dev/package-structure-full.md` +- `docs/dev/package-structure-short.md` +- generated tutorial notebooks +- generated `docs/site/` files + +### Steps + +1. Update architecture section 7: + + ```text + Project - The Top-Level Facade + -> Workspace - The Top-Level Facade + ``` + +2. Update the architecture table to use: + + ```text + workspace.project ProjectInfo + workspace.rendering Rendering + workspace.display WorkspaceDisplay + ``` + +3. Update saved layout examples: + + ```text + workspace.cif + structures/ + experiments/ + analysis/ + summary.cif + ``` + +4. Update public examples: + + ```python + workspace = ed.Workspace(project_id='lbco_hrpt') + workspace.project.title = '...' + ``` + +5. Update ADRs that describe current API. Historical reasoning can keep + old names only if it is clearly historical and not presented as + current usage. + +6. Update tutorial `.py` files, not notebooks. Phase 2 will run + `pixi run notebook-prepare`. + +7. Run grep: + + ```shell + rg -n "ed\\.Project|from easydiffraction import Project|project\\.info|project\\.rendering|ProjectDisplay|project\\.cif" docs README.md CONTRIBUTING.md + ``` + +8. Inspect each match manually. + +### Stop Conditions + +Stop and ask if: + +- a tutorial title uses "Project" as ordinary English rather than API + naming; +- historical ADRs would become misleading if edited mechanically. + +### Commit + +```text +Update docs for Workspace root API +``` + +## Phase 7: Remove Old Public `Project` Surface Unless Approved + +### Objective + +Finish the breaking rename by removing old public imports and module +paths unless the user approved compatibility. + +### Steps Without Compatibility Alias + +1. Ensure top-level `easydiffraction.__init__` exports `Workspace`, not + `Project`. + +2. Ensure no source imports from: + + ```text + easydiffraction.project + ``` + +3. Ensure no public source package remains at: + + ```text + src/easydiffraction/project + ``` + +4. Run grep: + + ```shell + rg -n "from easydiffraction import Project|ed\\.Project|easydiffraction\\.project|\\bProject\\(" src docs tests tools README.md CONTRIBUTING.md + ``` + +5. Any remaining match must be: + + - historical text that intentionally names the old API; or + - a test that will be updated in Phase 2; or + - a generated artifact that should not be edited manually. + +### Steps With Approved Compatibility Alias + +Only do this if the user explicitly approved it. + +1. Add a temporary alias in `src/easydiffraction/__init__.py`: + + ```python + Project = Workspace + ``` + +2. Keep the alias undocumented unless the user asks for a migration note. + +3. Add tests in Phase 2 proving both `Workspace` and `Project` construct + the same root object. + +### Commit + +Without alias: + +```text +Remove old Project public API surface +``` + +With alias: + +```text +Add Project alias for Workspace migration +``` + +## Phase 1 Review Gate + +After Phase 1 commits are complete: + +1. Run `git status --short`. +2. Confirm only intended files are changed. +3. Summarize: + + - whether `Project` was removed or aliased; + - whether `workspace.cif` replaced `project.cif`; + - any files intentionally left for Phase 2 test updates; + - any unresolved questions. + +4. Stop and ask the user to review before starting Phase 2. + +Do not run the full verification suite until the user approves moving to +Phase 2. + +## Phase 2: Verification And Tests + +Only start this phase after the user approves the Phase 1 implementation. + +### Test Updates + +Move or update tests to mirror the new source tree: + +```text +tests/unit/easydiffraction/project/ +-> tests/unit/easydiffraction/workspace/ +``` + +Update imports: + +```python +from easydiffraction.workspace.workspace import Workspace +from easydiffraction.workspace.display import WorkspaceDisplay +``` + +Update functional and integration tests: + +```python +from easydiffraction import Workspace +workspace = Workspace(project_id='...') +``` + +### New Tests To Add + +Add focused tests for: + +1. `from easydiffraction import Workspace`. +2. `Workspace(project_id='p1').project.id == 'p1'`. +3. `workspace.project.title` round-trips through `workspace.cif`. +4. `workspace.rendering.table_engine` round-trips through + `workspace.cif`. +5. `Workspace.save()` writes `workspace.cif`. +6. `Workspace.load()` reads `workspace.cif`. +7. `workspace.cif` contains `_project.id`, not `_meta.project_id`. +8. `workspace.cif` contains `_rendering.table_engine`. +9. `workspace.path` is set after `save_as()` and `load()`. +10. `workspace.project` has no serialized path field. +11. `project.cif` is not written unless compatibility was approved. +12. `ed.Project` is absent unless compatibility was approved. + +If compatibility alias was approved, add tests for: + +1. `from easydiffraction import Project`. +2. `Project is Workspace` or equivalent behavior. +3. Any approved `project.cif` fallback behavior. + +### Verification Commands + +Run in this order: + +```shell +pixi run test-structure-check +pixi run fix +pixi run check +pixi run unit-tests +pixi run integration-tests +pixi run script-tests +pixi run notebook-prepare +pixi run notebook-tests +``` + +If `pixi run fix` regenerates package-structure docs, accept those +generated changes and do not hand-edit them. + +### Phase 2 Commit Suggestions + +Use one or more commits, depending on size: + +```text +Update workspace unit tests +Update tutorials for Workspace API +Regenerate tutorial notebooks for Workspace API +``` + +## Grep Checklist + +Use this checklist before final review. + +Runtime root object should use `Workspace`: + +```shell +rg -n "\\bProject\\b|ed\\.Project|from easydiffraction import Project" src tests docs tools README.md CONTRIBUTING.md +``` + +Project-information category should use `workspace.project`: + +```shell +rg -n "\\.info\\b|workspace\\.project|project\\.info" src tests docs tools README.md CONTRIBUTING.md +``` + +CIF project category should stay `_project`: + +```shell +rg -n "_meta\\.|_project\\." src tests docs tools README.md CONTRIBUTING.md +``` + +Saved config file should be `workspace.cif`: + +```shell +rg -n "project\\.cif|workspace\\.cif" src tests docs tools README.md CONTRIBUTING.md +``` + +Generated docs should not be manually edited: + +```shell +git diff -- docs/site docs/dev/package-structure-full.md docs/dev/package-structure-short.md +``` + +If package-structure docs changed because of `pixi run fix`, that is +expected. If `docs/site` changed, ask before staging. + +## Common Mistakes + +### Mistake: Renaming `_project.*` To `_workspace.*` + +Do not do this. The CIF category describes scientific project +information, not the runtime facade. + +Correct: + +```cif +_project.id +_project.title +``` + +Incorrect: + +```cif +_workspace.project_id +_workspace.title +``` + +### Mistake: Introducing `_meta.*` + +Do not replace `_project.*` with `_meta.*`. + +Correct: + +```cif +_project.title +``` + +Incorrect: + +```cif +_meta.project_title +``` + +### Mistake: Blindly Replacing Every `project` + +Some uses of `project` should remain: + +- CIF tags such as `_project.id` +- `workspace.project` +- scientific project wording in prose +- `ProjectInfo` class name, unless a later ADR changes it + +Only root-facade uses should become `workspace` or `Workspace`. + +### Mistake: Leaving Path On Project Information + +The saved directory path belongs to the workspace runtime state. It +should be `workspace.path`, not `workspace.project.path`. + +### Mistake: Editing Generated Notebooks Directly + +Tutorial notebooks are generated artifacts. Edit tutorial `.py` files, +then run `pixi run notebook-prepare` in Phase 2. + +## Suggested Pull Request + +Title: + +```text +Rename Project root object to Workspace +``` + +Description: + +```text +This change separates the working EasyDiffraction workspace from the +scientific project information stored inside it. Users now create a +Workspace, while project title and description live under +workspace.project and continue to serialize with clear _project.* CIF +names. +``` From 212cda20236bdedcb8b596d1c8071b0c89d3ebc9 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 14:24:22 +0200 Subject: [PATCH 11/15] Finalize CategoryOwner refactor --- .../adr_category-owner-sections.md | 20 +- .../adr_workspace-root-project-category.md | 21 +- docs/dev/Issues/issues_closed.md | 4 +- docs/dev/architecture.md | 35 ++- docs/dev/package-structure-full.md | 11 +- docs/dev/package-structure-short.md | 6 + docs/dev/plan_category-owner-sections.md | 202 +++++++++--------- .../plan_workspace-root-project-category.md | 31 ++- src/easydiffraction/analysis/analysis.py | 2 +- src/easydiffraction/core/category_owner.py | 6 +- src/easydiffraction/core/variable.py | 6 +- src/easydiffraction/io/cif/serialize.py | 2 +- .../project/categories/info/__init__.py | 2 +- .../project/categories/info/default.py | 19 +- .../project/categories/info/factory.py | 2 +- .../project/categories/rendering/default.py | 5 +- src/easydiffraction/project/project.py | 7 +- src/easydiffraction/project/project_config.py | 2 +- src/easydiffraction/project/project_info.py | 4 +- .../core/test_category_owner.py | 134 ++++++++++++ .../test_serialize_category_owner_baseline.py | 2 +- .../project/categories/info/test_default.py | 63 ++++++ .../project/categories/info/test_factory.py | 32 +++ .../project/test_project_config.py | 68 ++++++ 24 files changed, 512 insertions(+), 174 deletions(-) create mode 100644 tests/unit/easydiffraction/core/test_category_owner.py create mode 100644 tests/unit/easydiffraction/project/categories/info/test_default.py create mode 100644 tests/unit/easydiffraction/project/categories/info/test_factory.py create mode 100644 tests/unit/easydiffraction/project/test_project_config.py diff --git a/docs/dev/ADR-suggestions/adr_category-owner-sections.md b/docs/dev/ADR-suggestions/adr_category-owner-sections.md index 8778af48..bf13940d 100644 --- a/docs/dev/ADR-suggestions/adr_category-owner-sections.md +++ b/docs/dev/ADR-suggestions/adr_category-owner-sections.md @@ -10,8 +10,8 @@ The current architecture has two real datablock families: - structures - experiments -These are real datablocks because each instance maps to a CIF `data_` -block and has a datablock entry name. This is reflected in +These are real datablocks because each instance maps to a CIF +`data_` block and has a datablock entry name. This is reflected in `DatablockItem`, `DatablockCollection`, and `datablock_item_to_cif()`. `Analysis` and project-level configuration are different. They own @@ -85,8 +85,8 @@ GuardedBase ### 1. A real datablock must emit a CIF `data_` header -Only objects that serialize as independent CIF data blocks should inherit -datablock-specific behavior. +Only objects that serialize as independent CIF data blocks should +inherit datablock-specific behavior. Current real datablocks: @@ -216,8 +216,8 @@ Rejected. This would be the smallest code change, but it would make "datablock" mean both real CIF data blocks and singleton project sections. It would -also encourage fake identities such as `datablock_entry_name = -"analysis"`. +also encourage fake identities such as +`datablock_entry_name = "analysis"`. ### Add `emit_data_header = False` to `DatablockItem` @@ -273,12 +273,12 @@ When implementation is complete: 1. Change status from `Proposed` to `Accepted and implemented`. 2. Update the date if the project convention requires the implementation date. -3. Replace tentative wording such as "should" and "target hierarchy" with - the actual final design. +3. Replace tentative wording such as "should" and "target hierarchy" + with the actual final design. 4. Record any deviations from the migration plan. 5. Link to the implementation PR or commit if available. -6. Move this file from `docs/dev/ADR-suggestions/` to `docs/dev/ADRs/` if - that is the repository convention for accepted decisions. +6. Move this file from `docs/dev/ADR-suggestions/` to `docs/dev/ADRs/` + if that is the repository convention for accepted decisions. 7. Update `docs/dev/architecture.md`. 8. Update or close the related issue in `docs/dev/Issues/issues_open.md` (move to `docs/dev/Issues/issues_closed.md` on full resolution). diff --git a/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md b/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md index 88083794..b53ead95 100644 --- a/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md +++ b/docs/dev/ADR-suggestions/adr_workspace-root-project-category.md @@ -159,12 +159,12 @@ _project.description After this decision, each layer has a clear rule: -| Layer | Rule | Example | -| --- | --- | --- | -| Runtime root | working-session facade | `Workspace` | -| Public category | semantic category name | `workspace.project` | -| CIF category | semantic CIF category | `_project.*` | -| Config file | workspace singleton categories | `workspace.cif` | +| Layer | Rule | Example | +| --------------- | ------------------------------ | ------------------- | +| Runtime root | working-session facade | `Workspace` | +| Public category | semantic category name | `workspace.project` | +| CIF category | semantic CIF category | `_project.*` | +| Config file | workspace singleton categories | `workspace.cif` | This avoids one-off aliases such as `project.info` while preserving semantic CIF names. @@ -175,12 +175,13 @@ semantic CIF names. - The root object and project-information category no longer share the same conceptual name. -- Public category access becomes uniform: - `workspace.project`, `workspace.rendering`, `workspace.analysis`. +- Public category access becomes uniform: `workspace.project`, + `workspace.rendering`, `workspace.analysis`. - CIF stays semantic and does not introduce `_meta.*`. - Project information can use short item names such as `id`, `title`, and `description`. -- The top-level facade name better reflects active runtime orchestration. +- The top-level facade name better reflects active runtime + orchestration. ### Negative @@ -293,7 +294,7 @@ The high-level migration is: 9. Rename saved singleton config file from `project.cif` to `workspace.cif`. 10. Update code, tests, scripts, tutorials, docs, and architecture - references. + references. ## Post-Implementation ADR Update diff --git a/docs/dev/Issues/issues_closed.md b/docs/dev/Issues/issues_closed.md index a634932c..0e6db4dc 100644 --- a/docs/dev/Issues/issues_closed.md +++ b/docs/dev/Issues/issues_closed.md @@ -69,8 +69,8 @@ category owners. `DatablockItem` now extends `CategoryOwner`, keeping real `data_` header behavior for structures and experiments. `Analysis` also extends `CategoryOwner`, reusing shared category discovery, parameter aggregation, and dirty tracking while remaining a -singleton section without a `data_` header. CIF serialization now -splits category-body rendering from datablock header rendering via +singleton section without a `data_` header. CIF serialization now splits +category-body rendering from datablock header rendering via `category_owner_to_cif()`. --- diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md index 7259c48c..768531e7 100644 --- a/docs/dev/architecture.md +++ b/docs/dev/architecture.md @@ -191,14 +191,13 @@ execution order within a datablock (e.g. background before data). `CategoryOwner` is the shared base class for objects that own flat category siblings. It provides category discovery, category sorting by `_update_priority`, parameter aggregation, `_need_categories_update` -tracking, and category-body CIF serialization without a `data_` -header. +tracking, and category-body CIF serialization without a `data_` header. -`DatablockItem` extends `CategoryOwner` for real CIF `data_` -blocks. `Structure` and `ExperimentBase` subclasses are real -datablocks. `Analysis` is also a `CategoryOwner`, but it serializes as -a singleton section body in `analysis/analysis.cif` and does not emit a -fake `data_analysis` header. +`DatablockItem` extends `CategoryOwner` for real CIF `data_` blocks. +`Structure` and `ExperimentBase` subclasses are real datablocks. +`Analysis` is also a `CategoryOwner`, but it serializes as a singleton +section body in `analysis/analysis.cif` and does not emit a fake +`data_analysis` header. | Aspect | `DatablockItem` | `DatablockCollection` | | ------------------ | ------------------------------------------- | -------------------------------------------------------- | @@ -861,13 +860,13 @@ workflow: `Analysis` is bound to a `Project` and provides the high-level API: - Singleton section: `Analysis` is a `CategoryOwner`, not a - `DatablockItem`. It owns sibling categories and serializes as the - body of `analysis/analysis.cif` without a `data_` header. -- Fit configuration: `fitting` (`CategoryItem` with - `minimizer_type`). `fitting.minimizer_type` selects the minimizer - backend. The active fitting mode lives on the owner as - `analysis.fitting_mode_type`, not as a nested child category field. - `fitting.show_minimizer_types()` lists supported minimizers. + `DatablockItem`. It owns sibling categories and serializes as the body + of `analysis/analysis.cif` without a `data_` header. +- Fit configuration: `fitting` (`CategoryItem` with `minimizer_type`). + `fitting.minimizer_type` selects the minimizer backend. The active + fitting mode lives on the owner as `analysis.fitting_mode_type`, not + as a nested child category field. `fitting.show_minimizer_types()` + lists supported minimizers. - Joint-fit weights: `joint_fit` (`CategoryCollection` of per-experiment weight entries); sibling of `fitting`, not a child. - Fit results: `analysis.fit_results` stores the latest runtime result @@ -938,10 +937,10 @@ It owns and coordinates all components: | `project.verbosity` | `str` | Console output level (full/short/silent) | Internally, `Project` keeps project-scoped singleton categories under a -private `ProjectConfig(CategoryOwner)` object. That owner currently holds -`project.info` (`ProjectInfo`) and `project.rendering` (`Rendering`), -while the public access paths stay flat on `Project` for user -discoverability. +private `ProjectConfig(CategoryOwner)` object. That owner currently +holds `project.info` (`ProjectInfo`) and `project.rendering` +(`Rendering`), while the public access paths stay flat on `Project` for +user discoverability. ### 7.1 Data Flow diff --git a/docs/dev/package-structure-full.md b/docs/dev/package-structure-full.md index c1728f1b..720505ba 100644 --- a/docs/dev/package-structure-full.md +++ b/docs/dev/package-structure-full.md @@ -122,6 +122,8 @@ │ ├── 📄 category.py │ │ ├── 🏷️ class CategoryItem │ │ └── 🏷️ class CategoryCollection +│ ├── 📄 category_owner.py +│ │ └── 🏷️ class CategoryOwner │ ├── 📄 collection.py │ │ └── 🏷️ class CollectionBase │ ├── 📄 datablock.py @@ -425,6 +427,12 @@ │ └── 📄 ascii.py ├── 📁 project │ ├── 📁 categories +│ │ ├── 📁 info +│ │ │ ├── 📄 __init__.py +│ │ │ ├── 📄 default.py +│ │ │ │ └── 🏷️ class ProjectInfo +│ │ │ └── 📄 factory.py +│ │ │ └── 🏷️ class ProjectInfoFactory │ │ ├── 📁 rendering │ │ │ ├── 📄 __init__.py │ │ │ ├── 📄 default.py @@ -441,8 +449,9 @@ │ │ └── 🏷️ class ProjectDisplay │ ├── 📄 project.py │ │ └── 🏷️ class Project +│ ├── 📄 project_config.py +│ │ └── 🏷️ class ProjectConfig │ └── 📄 project_info.py -│ └── 🏷️ class ProjectInfo ├── 📁 summary │ ├── 📄 __init__.py │ └── 📄 summary.py diff --git a/docs/dev/package-structure-short.md b/docs/dev/package-structure-short.md index 4c1c62f8..da42f7a5 100644 --- a/docs/dev/package-structure-short.md +++ b/docs/dev/package-structure-short.md @@ -64,6 +64,7 @@ ├── 📁 core │ ├── 📄 __init__.py │ ├── 📄 category.py +│ ├── 📄 category_owner.py │ ├── 📄 collection.py │ ├── 📄 datablock.py │ ├── 📄 diagnostic.py @@ -207,6 +208,10 @@ │ └── 📄 ascii.py ├── 📁 project │ ├── 📁 categories +│ │ ├── 📁 info +│ │ │ ├── 📄 __init__.py +│ │ │ ├── 📄 default.py +│ │ │ └── 📄 factory.py │ │ ├── 📁 rendering │ │ │ ├── 📄 __init__.py │ │ │ ├── 📄 default.py @@ -215,6 +220,7 @@ │ ├── 📄 __init__.py │ ├── 📄 display.py │ ├── 📄 project.py +│ ├── 📄 project_config.py │ └── 📄 project_info.py ├── 📁 summary │ ├── 📄 __init__.py diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 3fdd03ef..9a62c63e 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -10,8 +10,8 @@ Two-phase workflow (see `.github/copilot-instructions.md`): Phase 0 baseline characterization tests below are an explicit exception (they describe pre-existing behavior so the refactor can be verified mechanically). Do not add new feature tests during Phase 1. -- Phase 2 — Verification. Add the per-phase tests listed below, then - run the verification commands in the "Phase 2 Verification" section. +- Phase 2 — Verification. Add the per-phase tests listed below, then run + the verification commands in the "Phase 2 Verification" section. Stop after Phase 1 and request review before starting Phase 2. @@ -48,7 +48,8 @@ the **Commits** section of `.github/copilot-instructions.md`. - One commit per step. Atomic and single-purpose. - Stage explicit paths only — do not `git add .`. -- Suggested commit messages (≤72 chars, imperative mood, no type prefix): +- Suggested commit messages (≤72 chars, imperative mood, no type + prefix): - `Add baseline category-owner characterization tests` - `Add CategoryOwner base class` - `Make DatablockItem inherit CategoryOwner` @@ -62,21 +63,22 @@ the **Commits** section of `.github/copilot-instructions.md`. ## Purpose -This plan explains how to make `Analysis` and project-level configuration -reuse the same category-management behavior as real datablocks without making -them real datablocks. +This plan explains how to make `Analysis` and project-level +configuration reuse the same category-management behavior as real +datablocks without making them real datablocks. The current code has two real datablock types: - `Structure` - `Experiment` -Those are real datablocks because they represent CIF `data_` blocks and -have a user-defined block ID. +Those are real datablocks because they represent CIF `data_` blocks +and have a user-defined block ID. -`Analysis` and project-level configuration are different. They own CIF-like -categories, but they are singleton project sections. They should not require a -datablock ID and should not serialize as `data_analysis` or `data_project`. +`Analysis` and project-level configuration are different. They own +CIF-like categories, but they are singleton project sections. They +should not require a datablock ID and should not serialize as +`data_analysis` or `data_project`. The target design is: @@ -94,31 +96,28 @@ GuardedBase `-- ProjectConfig # optional follow-up ``` -`CategoryOwner` contains shared behavior for objects that own flat categories. -`DatablockItem` keeps the additional behavior that only real CIF data blocks -need. +`CategoryOwner` contains shared behavior for objects that own flat +categories. `DatablockItem` keeps the additional behavior that only real +CIF data blocks need. ## Important Definitions Use these terms consistently during the migration. -`DatablockItem` -: A real CIF data block. It serializes with a `data_` header. It has a - `datablock_entry_name`. Examples: `Structure`, `ExperimentBase`. +`DatablockItem` : A real CIF data block. It serializes with a +`data_` header. It has a `datablock_entry_name`. Examples: +`Structure`, `ExperimentBase`. -`CategoryOwner` -: An object that owns flat sibling categories and can update, enumerate, and - serialize those categories. It does not necessarily have a `data_` header. - Examples after migration: `DatablockItem`, `Analysis`, optional - `ProjectConfig`. +`CategoryOwner` : An object that owns flat sibling categories and can +update, enumerate, and serialize those categories. It does not +necessarily have a `data_` header. Examples after migration: +`DatablockItem`, `Analysis`, optional `ProjectConfig`. -`CategoryItem` -: A single category row. Example: `Cell`, `SpaceGroup`, `Fitting`, - `Rendering`. +`CategoryItem` : A single category row. Example: `Cell`, `SpaceGroup`, +`Fitting`, `Rendering`. -`CategoryCollection` -: A loop-style category collection. Example: `AtomSites`, `Aliases`, - `JointFitCollection`. +`CategoryCollection` : A loop-style category collection. Example: +`AtomSites`, `Aliases`, `JointFitCollection`. ## Non-Negotiable Rules @@ -130,15 +129,16 @@ Follow these rules throughout the migration: 4. Do not emit `data_project` in the saved `project.cif` file. 5. Do not rename CIF tags during this migration. 6. Do not change public access paths unless a phase explicitly says so. -7. Keep `project.parameters` limited to fit-relevant structure and experiment - parameters unless a separate design decision changes that later. -8. Keep inactive analysis categories accessible unless the fit-mode policy is - intentionally changed in a separate task. +7. Keep `project.parameters` limited to fit-relevant structure and + experiment parameters unless a separate design decision changes that + later. +8. Keep inactive analysis categories accessible unless the fit-mode + policy is intentionally changed in a separate task. ## Recommended Branching Strategy -Use several small pull requests or commits. Each phase should be independently -reviewable. +Use several small pull requests or commits. Each phase should be +independently reviewable. Suggested split: @@ -151,13 +151,13 @@ Suggested split: 7. Optional project-config cleanup. 8. Documentation cleanup. -Do not combine phases 4, 5, and 6 into one large change. If something breaks, -small phases make the source of the break obvious. +Do not combine phases 4, 5, and 6 into one large change. If something +breaks, small phases make the source of the break obvious. ## Phase 0: Baseline Safety Tests -Before changing production code, add or confirm tests that describe current -behavior. +Before changing production code, add or confirm tests that describe +current behavior. ### Files To Read First @@ -226,8 +226,8 @@ src/easydiffraction/core/category_owner.py ### Responsibility -`CategoryOwner` should own the behavior currently shared by any object that has -flat categories: +`CategoryOwner` should own the behavior currently shared by any object +that has flat categories: - category discovery - category sorting by `_update_priority` @@ -305,12 +305,14 @@ Keep it simple: ### Tests For Phase 1 -Create fake category owner tests. Do not use `Structure` or `Analysis` yet. +Create fake category owner tests. Do not use `Structure` or `Analysis` +yet. Test: 1. `CategoryOwner.categories` finds direct `CategoryItem` attributes. -2. `CategoryOwner.categories` finds direct `CategoryCollection` attributes. +2. `CategoryOwner.categories` finds direct `CategoryCollection` + attributes. 3. Categories are sorted by `_update_priority`. 4. `CategoryOwner.parameters` aggregates parameters from categories. 5. `_update_categories()` calls category `_update()` methods. @@ -381,8 +383,8 @@ pixi run unit-tests tests/unit/easydiffraction/datablocks pixi run unit-tests tests/unit/easydiffraction/io/cif ``` -If failures happen, check parent linkage first. Most failures in this phase are -likely caused by categories not having the expected `_parent`. +If failures happen, check parent linkage first. Most failures in this +phase are likely caused by categories not having the expected `_parent`. ## Phase 3: Split CIF Body Serialization From Datablock Serialization @@ -399,7 +401,8 @@ src/easydiffraction/io/cif/serialize.py 1. Adds the `data_` header. 2. Serializes category content. -Only real datablocks need job 1. `Analysis` and project config need job 2. +Only real datablocks need job 1. `Analysis` and project config need +job 2. ### Add `category_owner_to_cif()` @@ -439,12 +442,13 @@ def category_owner_to_cif( return "\n\n".join([part for part in item_parts + collection_parts if part]) ``` -Keep the current item-first, collection-second ordering unless a separate test -and design decision changes it. +Keep the current item-first, collection-second ordering unless a +separate test and design decision changes it. ### Update `datablock_item_to_cif()` -After adding `category_owner_to_cif()`, reduce `datablock_item_to_cif()` to: +After adding `category_owner_to_cif()`, reduce `datablock_item_to_cif()` +to: ```python header = f"data_{datablock._identity.datablock_entry_name}" @@ -498,8 +502,9 @@ Call `super().__init__()` at the start of `Analysis.__init__()`. ### Parent Linkage -Because `Analysis` will now be a `GuardedBase` subclass, private assignment of -`GuardedBase` children should normally set `_parent` automatically. +Because `Analysis` will now be a `GuardedBase` subclass, private +assignment of `GuardedBase` children should normally set `_parent` +automatically. Still verify these categories have `_parent is analysis`: @@ -510,8 +515,8 @@ Still verify these categories have `_parent is analysis`: - `analysis.sequential_fit` - `analysis.sequential_fit_extract` -Some are currently assigned manually and some may not be. Make the result -consistent. +Some are currently assigned manually and some may not be. Make the +result consistent. ### Keep Existing Public API @@ -527,9 +532,9 @@ project.analysis.sequential_fit_extract project.analysis.fitting_mode_type ``` -If direct public attributes currently exist without properties, convert them -carefully. Less experienced agents should prefer a compatibility-preserving -change: +If direct public attributes currently exist without properties, convert +them carefully. Less experienced agents should prefer a +compatibility-preserving change: 1. Keep the public access path. 2. Add properties only when needed. @@ -622,9 +627,9 @@ def _update_categories( self.constraints_handler.apply() ``` -If this changes behavior, use the existing constraint logic and leave a short -comment explaining why shared category updates are intentionally skipped or -ordered differently. +If this changes behavior, use the existing constraint logic and leave a +short comment explaining why shared category updates are intentionally +skipped or ordered differently. ### Tests For Phase 4 @@ -632,7 +637,8 @@ Add tests for: 1. `isinstance(analysis, CategoryOwner)`. 2. `analysis.categories` includes analysis categories. -3. `analysis.parameters` includes fitting and analysis config descriptors. +3. `analysis.parameters` includes fitting and analysis config + descriptors. 4. Parent links are set on all analysis categories. 5. `analysis.as_cif` still does not start with `data_`. 6. `analysis.as_cif` still includes `_fitting.mode_type`. @@ -657,11 +663,11 @@ src/easydiffraction/core/variable.py ### Current Behavior -Descriptors currently find a `DatablockItem` ancestor when marking an owner -dirty after value changes. +Descriptors currently find a `DatablockItem` ancestor when marking an +owner dirty after value changes. -That is too narrow after this migration. `Analysis` will also be a category -owner. +That is too narrow after this migration. `Analysis` will also be a +category owner. ### Target Behavior @@ -735,8 +741,8 @@ serializes via shared `CategoryOwner` helpers, and the public ### Low-Risk Version -If converting `ProjectInfo` into a `CategoryItem` is too disruptive, do not do -it immediately. +If converting `ProjectInfo` into a `CategoryItem` is too disruptive, do +not do it immediately. Instead: @@ -769,10 +775,10 @@ timestamp in the `ProjectInfo` category. ### Save/Load Rule -Even after this cleanup, saved `project.cif` should remain a section file -without an explicit `data_` header. The loader currently wraps project config -with `data_project` before parsing, and that can remain an implementation -detail. +Even after this cleanup, saved `project.cif` should remain a section +file without an explicit `data_` header. The loader currently wraps +project config with `data_project` before parsing, and that can remain +an implementation detail. ## Phase 7: Documentation Updates @@ -830,11 +836,11 @@ Explain that: When the migration is complete, update issue 5 in `docs/dev/Issues/issues_open.md`. -If fully complete, remove the issue and add a note to closed issues if that is -the project convention. +If fully complete, remove the issue and add a note to closed issues if +that is the project convention. -If only `CategoryOwner` is introduced but `Analysis` is not migrated yet, -change the issue text to reflect the remaining work. +If only `CategoryOwner` is introduced but `Analysis` is not migrated +yet, change the issue text to reflect the remaining work. ## Final Acceptance Criteria @@ -852,7 +858,8 @@ The migration is done when all of these are true: 10. Inactive analysis categories are not serialized. 11. Dirty-flag marking works for structures, experiments, and analysis. 12. Project save/load format remains compatible. -13. Documentation distinguishes real datablocks from category-owning sections. +13. Documentation distinguishes real datablocks from category-owning + sections. ## Common Mistakes To Avoid @@ -864,8 +871,8 @@ Do not solve this by setting: self._identity.datablock_entry_name = lambda: "analysis" ``` -That makes parameter identities and CIF output imply that analysis is a real -datablock. It is not. +That makes parameter identities and CIF output imply that analysis is a +real datablock. It is not. ### Mistake: Adding A Boolean Flag To `DatablockItem` @@ -881,25 +888,26 @@ This works mechanically but weakens the model. The class name ### Mistake: Changing Fit Parameter Semantics -Do not change `project.parameters` to include analysis parameters during this -migration. Analysis parameters are configuration parameters, not model -parameters for fitting. +Do not change `project.parameters` to include analysis parameters during +this migration. Analysis parameters are configuration parameters, not +model parameters for fitting. ### Mistake: Refactoring Public API While Moving Base Classes Do not rename `fitting_mode_type`, `joint_fit`, `sequential_fit`, -`rendering`, or `info` during this migration. Naming cleanup can happen later. +`rendering`, or `info` during this migration. Naming cleanup can happen +later. ### Mistake: Assuming `vars(owner)` Order Is A Design Contract If serialization order matters, write tests for it. Prefer explicit -`_serializable_categories()` hooks for owners that need a specific active -subset. +`_serializable_categories()` hooks for owners that need a specific +active subset. ## Minimal Agent Checklist -The authoritative checklist lives in the **Status** section at the top of -this document. Keep it updated as work progresses. +The authoritative checklist lives in the **Status** section at the top +of this document. Keep it updated as work progresses. ## Phase 2 Verification @@ -936,22 +944,22 @@ and re-run `pixi run check` until clean. **Description (end-user oriented):** -This change reorganizes how the library models the different pieces of -a project that own CIF-like categories. Crystal structures and -experiments stay "real" CIF data blocks — each one is saved with its -own `data_` header, exactly as before. The `Analysis` section -(fitting mode, aliases, constraints, joint/sequential fit settings) and -project-level configuration are now treated as singleton sections that -share the same convenient features (category discovery, parameter -listing, dirty tracking, `help()` tables) without pretending to be data -blocks. As a result: +This change reorganizes how the library models the different pieces of a +project that own CIF-like categories. Crystal structures and experiments +stay "real" CIF data blocks — each one is saved with its own `data_` +header, exactly as before. The `Analysis` section (fitting mode, +aliases, constraints, joint/sequential fit settings) and project-level +configuration are now treated as singleton sections that share the same +convenient features (category discovery, parameter listing, dirty +tracking, `help()` tables) without pretending to be data blocks. As a +result: - Saved CIF files keep their current layout: structures and experiments start with `data_`; analysis and project configuration stay as section files without a fake `data_` header. -- `project.parameters`, `analysis.parameters`, and `structure.parameters` - behave consistently and discoverably. +- `project.parameters`, `analysis.parameters`, and + `structure.parameters` behave consistently and discoverably. - Future project metadata cleanup can reuse the same base class. -No user-facing API names change in this PR; existing tutorials, -scripts, and saved projects continue to work. +No user-facing API names change in this PR; existing tutorials, scripts, +and saved projects continue to work. diff --git a/docs/dev/plan_workspace-root-project-category.md b/docs/dev/plan_workspace-root-project-category.md index 3cfbbfa1..b57e318a 100644 --- a/docs/dev/plan_workspace-root-project-category.md +++ b/docs/dev/plan_workspace-root-project-category.md @@ -131,7 +131,8 @@ Use these decisions unless the user explicitly changes the ADR before implementation: - The public root class becomes `Workspace`. -- The public root import becomes `from easydiffraction import Workspace`. +- The public root import becomes + `from easydiffraction import Workspace`. - The public project-information category becomes `workspace.project`. - The public rendering category remains `workspace.rendering`. - The project-information category keeps semantic CIF tags `_project.*`. @@ -299,7 +300,6 @@ Rename the top-level runtime facade and package from `project` to ``` For every match, decide whether it refers to: - - the old root object, which should become `Workspace`; - the project-information category, which should remain project; - historical text that should be updated in docs later. @@ -389,8 +389,8 @@ instead of `workspace.info`. rg -n "\\.info\\b|categories/info|categories\\.info" src ``` -9. For every match, update it if it refers to project information. - Leave unrelated uses of the word "info" alone. +9. For every match, update it if it refers to project information. Leave + unrelated uses of the word "info" alone. ### Stop Conditions @@ -451,7 +451,6 @@ the workspace location and is not serialized project information. ``` Apply this to: - - `Workspace.__init__` - `WorkspaceConfig.__init__` - `ProjectInfo.__init__` @@ -496,12 +495,12 @@ the workspace location and is not serialized project information. 10. Run grep: - ```shell - rg -n "\\.name\\b|\\.path\\b|project_id|Project identifier" src/easydiffraction/workspace src/easydiffraction/io src/easydiffraction/display src/easydiffraction/summary - ``` +```shell +rg -n "\\.name\\b|\\.path\\b|project_id|Project identifier" src/easydiffraction/workspace src/easydiffraction/io src/easydiffraction/display src/easydiffraction/summary +``` - Inspect each match manually. Do not blindly replace every `.name`; - structures and experiments still use `.name`. +Inspect each match manually. Do not blindly replace every `.name`; +structures and experiments still use `.name`. ### Stop Conditions @@ -683,8 +682,8 @@ Update runtime references to Workspace ### Objective -Update user-facing and developer-facing documentation to describe the new -root object and project-information category. +Update user-facing and developer-facing documentation to describe the +new root object and project-information category. ### Files Likely To Change @@ -797,7 +796,6 @@ paths unless the user approved compatibility. ``` 5. Any remaining match must be: - - historical text that intentionally names the old API; or - a test that will be updated in Phase 2; or - a generated artifact that should not be edited manually. @@ -812,7 +810,8 @@ Only do this if the user explicitly approved it. Project = Workspace ``` -2. Keep the alias undocumented unless the user asks for a migration note. +2. Keep the alias undocumented unless the user asks for a migration + note. 3. Add tests in Phase 2 proving both `Workspace` and `Project` construct the same root object. @@ -838,7 +837,6 @@ After Phase 1 commits are complete: 1. Run `git status --short`. 2. Confirm only intended files are changed. 3. Summarize: - - whether `Project` was removed or aliased; - whether `workspace.cif` replaced `project.cif`; - any files intentionally left for Phase 2 test updates; @@ -851,7 +849,8 @@ Phase 2. ## Phase 2: Verification And Tests -Only start this phase after the user approves the Phase 1 implementation. +Only start this phase after the user approves the Phase 1 +implementation. ### Test Updates diff --git a/src/easydiffraction/analysis/analysis.py b/src/easydiffraction/analysis/analysis.py index cb0ccbfe..08303f65 100644 --- a/src/easydiffraction/analysis/analysis.py +++ b/src/easydiffraction/analysis/analysis.py @@ -487,7 +487,7 @@ def _help_filter( return filtered_properties, methods def _serializable_categories(self) -> list: - """Analysis categories that should be written for the active fit mode.""" + """Serializable analysis categories for the active fit mode.""" categories = [ self.fitting, self.aliases, diff --git a/src/easydiffraction/core/category_owner.py b/src/easydiffraction/core/category_owner.py index 94f43231..09e0cf68 100644 --- a/src/easydiffraction/core/category_owner.py +++ b/src/easydiffraction/core/category_owner.py @@ -17,7 +17,9 @@ def __init__(self) -> None: @property def categories(self) -> list: - """All category objects owned by this object, sorted by priority.""" + """ + All category objects owned by this object, sorted by priority. + """ categories = [ value for value in vars(self).values() @@ -71,4 +73,4 @@ def help(self) -> None: columns_headers=['Category', 'Type', '# Parameters'], columns_alignment=['left', 'left', 'right'], columns_data=rows, - ) \ No newline at end of file + ) diff --git a/src/easydiffraction/core/variable.py b/src/easydiffraction/core/variable.py index bb4b03f5..257a5771 100644 --- a/src/easydiffraction/core/variable.py +++ b/src/easydiffraction/core/variable.py @@ -355,9 +355,9 @@ def _set_value_user_constrained(self, v: object) -> None: """ Set the value from a constraint expression. - Bypasses validation and marks the parent category owner dirty, like - ``_set_value_from_minimizer``, because constraints are applied - inside the minimizer loop where trial values may exceed + Bypasses validation and marks the parent category owner dirty, + like ``_set_value_from_minimizer``, because constraints are + applied inside the minimizer loop where trial values may exceed physical-range validators. Flags the parameter as user constrained. Used exclusively by ``ConstraintsHandler.apply()``. """ diff --git a/src/easydiffraction/io/cif/serialize.py b/src/easydiffraction/io/cif/serialize.py index 89d970d5..f7625330 100644 --- a/src/easydiffraction/io/cif/serialize.py +++ b/src/easydiffraction/io/cif/serialize.py @@ -327,7 +327,7 @@ def datablock_item_to_cif( body = category_owner_to_cif(datablock, max_loop_display=max_loop_display) if not body: return header - return '\n\n'.join([header, body]) + return f'{header}\n\n{body}' def datablock_collection_to_cif(collection: object) -> str: diff --git a/src/easydiffraction/project/categories/info/__init__.py b/src/easydiffraction/project/categories/info/__init__.py index 34ab82b9..5b464da2 100644 --- a/src/easydiffraction/project/categories/info/__init__.py +++ b/src/easydiffraction/project/categories/info/__init__.py @@ -5,4 +5,4 @@ from __future__ import annotations from easydiffraction.project.categories.info.default import ProjectInfo -from easydiffraction.project.categories.info.factory import ProjectInfoFactory \ No newline at end of file +from easydiffraction.project.categories.info.factory import ProjectInfoFactory diff --git a/src/easydiffraction/project/categories/info/default.py b/src/easydiffraction/project/categories/info/default.py index 98be48e3..1d7b8e31 100644 --- a/src/easydiffraction/project/categories/info/default.py +++ b/src/easydiffraction/project/categories/info/default.py @@ -37,8 +37,8 @@ def __init__( ) -> None: super().__init__() - created = datetime.datetime.now() - last_modified = datetime.datetime.now() + created = datetime.datetime.now(tz=datetime.UTC) + last_modified = datetime.datetime.now(tz=datetime.UTC) self._project_id = StringDescriptor( name='id', @@ -77,12 +77,21 @@ def __init__( @staticmethod def _parse_timestamp(value: str) -> datetime.datetime: """Parse project timestamp text from CIF storage format.""" - return datetime.datetime.strptime(value, _PROJECT_TIMESTAMP_FORMAT) + return datetime.datetime.strptime(value, _PROJECT_TIMESTAMP_FORMAT).replace( + tzinfo=datetime.UTC, + ) + + @staticmethod + def _normalize_timestamp(value: datetime.datetime) -> datetime.datetime: + """Return timestamps as UTC-aware datetimes.""" + if value.tzinfo is None: + return value.replace(tzinfo=datetime.UTC) + return value.astimezone(datetime.UTC) @staticmethod def _format_timestamp(value: datetime.datetime) -> str: """Format a project timestamp for CIF storage.""" - return value.strftime(_PROJECT_TIMESTAMP_FORMAT) + return ProjectInfo._normalize_timestamp(value).strftime(_PROJECT_TIMESTAMP_FORMAT) @property def unique_name(self) -> str: @@ -163,4 +172,4 @@ def show_as_cif(self) -> None: """Pretty-print CIF via shared utilities.""" paragraph_title = f"Project 📦 '{self.name}' info as CIF" console.paragraph(paragraph_title) - render_cif(self.as_cif) \ No newline at end of file + render_cif(self.as_cif) diff --git a/src/easydiffraction/project/categories/info/factory.py b/src/easydiffraction/project/categories/info/factory.py index 2dc5ef08..1a6bccb4 100644 --- a/src/easydiffraction/project/categories/info/factory.py +++ b/src/easydiffraction/project/categories/info/factory.py @@ -14,4 +14,4 @@ class ProjectInfoFactory(FactoryBase): _default_rules: ClassVar[dict] = { frozenset(): 'default', - } \ No newline at end of file + } diff --git a/src/easydiffraction/project/categories/rendering/default.py b/src/easydiffraction/project/categories/rendering/default.py index cdca6f75..b55a6f92 100644 --- a/src/easydiffraction/project/categories/rendering/default.py +++ b/src/easydiffraction/project/categories/rendering/default.py @@ -83,9 +83,12 @@ def table_engine(self, value: str) -> None: @property def plotter(self) -> Plotter: """Live plotting facade bound to the owning project.""" - owner = getattr(self, '_parent', None) + direct_parent = getattr(self, '_parent', None) + owner = direct_parent while owner is not None and not hasattr(owner, 'structures'): owner = getattr(owner, '_parent', None) + if owner is None: + owner = direct_parent if owner is not None: self._plotter._set_project(owner) return self._plotter diff --git a/src/easydiffraction/project/project.py b/src/easydiffraction/project/project.py index 43113199..5bbe8acb 100644 --- a/src/easydiffraction/project/project.py +++ b/src/easydiffraction/project/project.py @@ -6,6 +6,7 @@ import pathlib import tempfile +from typing import TYPE_CHECKING from typing import ClassVar from typeguard import typechecked @@ -17,15 +18,17 @@ from easydiffraction.datablocks.structure.collection import Structures from easydiffraction.io.cif.serialize import project_config_to_cif from easydiffraction.io.cif.serialize import project_to_cif -from easydiffraction.project.categories.rendering import Rendering from easydiffraction.project.display import ProjectDisplay -from easydiffraction.project.project_info import ProjectInfo from easydiffraction.project.project_config import ProjectConfig from easydiffraction.summary.summary import Summary from easydiffraction.utils.enums import VerbosityEnum from easydiffraction.utils.logging import console from easydiffraction.utils.logging import log +if TYPE_CHECKING: + from easydiffraction.project.categories.rendering import Rendering + from easydiffraction.project.project_info import ProjectInfo + def _apply_csv_row_to_params( row: object, diff --git a/src/easydiffraction/project/project_config.py b/src/easydiffraction/project/project_config.py index f0f7437b..32147b60 100644 --- a/src/easydiffraction/project/project_config.py +++ b/src/easydiffraction/project/project_config.py @@ -44,4 +44,4 @@ def as_cif(self) -> str: """Serialize singleton project categories to CIF.""" from easydiffraction.io.cif.serialize import category_owner_to_cif # noqa: PLC0415 - return category_owner_to_cif(self) \ No newline at end of file + return category_owner_to_cif(self) diff --git a/src/easydiffraction/project/project_info.py b/src/easydiffraction/project/project_info.py index b0745b88..215e8ed5 100644 --- a/src/easydiffraction/project/project_info.py +++ b/src/easydiffraction/project/project_info.py @@ -4,4 +4,6 @@ from __future__ import annotations -from easydiffraction.project.categories.info.default import ProjectInfo +from easydiffraction.project.categories.info.default import ProjectInfo as _ProjectInfo + +ProjectInfo = _ProjectInfo diff --git a/tests/unit/easydiffraction/core/test_category_owner.py b/tests/unit/easydiffraction/core/test_category_owner.py new file mode 100644 index 00000000..264f01bb --- /dev/null +++ b/tests/unit/easydiffraction/core/test_category_owner.py @@ -0,0 +1,134 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +from easydiffraction.core.category import CategoryItem +from easydiffraction.core.category_owner import CategoryOwner +from easydiffraction.core.validation import AttributeSpec +from easydiffraction.core.variable import Parameter +from easydiffraction.io.cif.handler import CifHandler +from easydiffraction.io.cif.serialize import category_owner_to_cif + + +class _FastCategory(CategoryItem): + _update_priority = 1 + + def __init__(self, update_calls: list[tuple[str, bool]] | None = None) -> None: + super().__init__() + self._identity.category_code = 'fast' + self._update_calls = update_calls + self._param = Parameter( + name='param', + description='Fast category parameter', + value_spec=AttributeSpec(default=0.0), + units='', + cif_handler=CifHandler(names=['_fast.param']), + ) + + @property + def param(self) -> Parameter: + return self._param + + def _update(self, *, called_by_minimizer: bool = False) -> None: + if self._update_calls is not None: + self._update_calls.append(('fast', called_by_minimizer)) + + +class _SlowCategory(CategoryItem): + _update_priority = 20 + + def __init__(self, update_calls: list[tuple[str, bool]] | None = None) -> None: + super().__init__() + self._identity.category_code = 'slow' + self._update_calls = update_calls + self._param = Parameter( + name='param', + description='Slow category parameter', + value_spec=AttributeSpec(default=0.0), + units='', + cif_handler=CifHandler(names=['_slow.param']), + ) + + @property + def param(self) -> Parameter: + return self._param + + def _update(self, *, called_by_minimizer: bool = False) -> None: + if self._update_calls is not None: + self._update_calls.append(('slow', called_by_minimizer)) + + +class _Owner(CategoryOwner): + def __init__(self, update_calls: list[tuple[str, bool]] | None = None) -> None: + super().__init__() + self._slow = _SlowCategory(update_calls) + self._fast = _FastCategory(update_calls) + + @property + def fast(self) -> _FastCategory: + return self._fast + + @property + def slow(self) -> _SlowCategory: + return self._slow + + @property + def as_cif(self) -> str: + return category_owner_to_cif(self) + + +class _OwnerWithSerializableSubset(_Owner): + def _serializable_categories(self) -> list: + return [self.slow] + + +def test_category_owner_sorts_categories_and_flattens_parameters(): + owner = _Owner() + + assert owner.fast._parent is owner + assert owner.slow._parent is owner + assert [category._identity.category_code for category in owner.categories] == ['fast', 'slow'] + assert [parameter.unique_name for parameter in owner.parameters] == [ + 'fast.param', + 'slow.param', + ] + + +def test_category_owner_updates_only_when_needed_and_can_force_minimizer_updates(): + update_calls: list[tuple[str, bool]] = [] + owner = _Owner(update_calls) + + owner._update_categories() + + assert update_calls == [('fast', False), ('slow', False)] + assert owner._need_categories_update is False + + update_calls.clear() + owner._update_categories() + assert update_calls == [] + + owner._update_categories(called_by_minimizer=True) + assert update_calls == [('fast', True), ('slow', True)] + assert owner._need_categories_update is False + + +def test_category_owner_descriptor_changes_mark_owner_dirty(): + owner = _Owner() + owner._need_categories_update = False + + owner.fast.param.value = 1.5 + assert owner._need_categories_update is True + + owner._need_categories_update = False + owner.slow.param._set_value_from_minimizer(2.5) + assert owner._need_categories_update is True + + +def test_category_owner_as_cif_respects_serializable_categories_override(): + owner = _OwnerWithSerializableSubset() + + cif_text = owner.as_cif + + assert '_slow.param' in cif_text + assert '_fast.param' not in cif_text diff --git a/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py b/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py index e5aaab09..badd7241 100644 --- a/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py +++ b/tests/unit/easydiffraction/io/cif/test_serialize_category_owner_baseline.py @@ -94,4 +94,4 @@ def test_real_analysis_as_cif_includes_sequential_sections_only_in_sequential_mo assert '_sequential_fit_extract.id' in analysis_cif assert '_sequential_fit_extract.target' in analysis_cif assert 'diffrn.ambient_temperature' in analysis_cif - assert '_joint_fit.experiment_id' not in analysis_cif \ No newline at end of file + assert '_joint_fit.experiment_id' not in analysis_cif diff --git a/tests/unit/easydiffraction/project/categories/info/test_default.py b/tests/unit/easydiffraction/project/categories/info/test_default.py new file mode 100644 index 00000000..b480ca16 --- /dev/null +++ b/tests/unit/easydiffraction/project/categories/info/test_default.py @@ -0,0 +1,63 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +import datetime + +import gemmi + + +def test_project_info_defaults_and_identity(): + from easydiffraction.project.categories.info.default import ProjectInfo + + info = ProjectInfo(name='beer', title='Beer title', description='Some description') + + assert info.type_info.tag == 'default' + assert info._identity.category_code == 'project' + assert info.name == 'beer' + assert info.title == 'Beer title' + assert info.description == 'Some description' + assert info.path is None + assert isinstance(info.created, datetime.datetime) + assert isinstance(info.last_modified, datetime.datetime) + assert info.created.tzinfo is not None + assert info.last_modified.tzinfo is not None + + +def test_project_info_setters_and_from_cif_restore_fields(): + from easydiffraction.project.categories.info.default import ProjectInfo + + info = ProjectInfo() + info.description = 'Some spaced\n description' + info.path = 'project-dir' + + assert info.description == 'Some spaced description' + assert info.path is not None + assert info.path.name == 'project-dir' + + block = gemmi.cif.read_string( + """data_test +_project.id beer +_project.title 'Beer title' +_project.description 'Some description' +_project.created '17 May 2026 11:13:21' +_project.last_modified '17 May 2026 11:13:51' +""", + ).sole_block() + + info.from_cif(block) + + assert info.name == 'beer' + assert info.title == 'Beer title' + assert info.description == 'Some description' + assert info.created == datetime.datetime(2026, 5, 17, 11, 13, 21, tzinfo=datetime.UTC) + assert info.last_modified == datetime.datetime( + 2026, + 5, + 17, + 11, + 13, + 51, + tzinfo=datetime.UTC, + ) diff --git a/tests/unit/easydiffraction/project/categories/info/test_factory.py b/tests/unit/easydiffraction/project/categories/info/test_factory.py new file mode 100644 index 00000000..9010e2f8 --- /dev/null +++ b/tests/unit/easydiffraction/project/categories/info/test_factory.py @@ -0,0 +1,32 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +import pytest + + +def test_project_info_factory_default_and_create(): + from easydiffraction.project.categories.info.default import ProjectInfo + from easydiffraction.project.categories.info.factory import ProjectInfoFactory + + assert ProjectInfoFactory.default_tag() == 'default' + assert 'default' in ProjectInfoFactory.supported_tags() + + info = ProjectInfoFactory.create( + 'default', + name='beer', + title='Beer title', + description='Some description', + ) + + assert isinstance(info, ProjectInfo) + assert info.name == 'beer' + assert info.title == 'Beer title' + + +def test_project_info_factory_rejects_unknown_tag(): + from easydiffraction.project.categories.info.factory import ProjectInfoFactory + + with pytest.raises(ValueError, match=r"Unsupported type: 'missing'"): + ProjectInfoFactory.create('missing') diff --git a/tests/unit/easydiffraction/project/test_project_config.py b/tests/unit/easydiffraction/project/test_project_config.py new file mode 100644 index 00000000..3ba20eea --- /dev/null +++ b/tests/unit/easydiffraction/project/test_project_config.py @@ -0,0 +1,68 @@ +# SPDX-FileCopyrightText: 2026 EasyScience contributors +# SPDX-License-Identifier: BSD-3-Clause + +from __future__ import annotations + +import datetime + + +def test_project_config_exposes_project_info_and_rendering_categories(): + from easydiffraction.core.category_owner import CategoryOwner + from easydiffraction.project.project_config import ProjectConfig + from easydiffraction.project.project_info import ProjectInfo + + config = ProjectConfig(name='beer', title='Beer title', description='Some description') + + assert isinstance(config, CategoryOwner) + assert isinstance(config.info, ProjectInfo) + assert config.info._parent is config + assert config.rendering._parent is config + assert config.info.name == 'beer' + assert config.info.title == 'Beer title' + assert config.info.description == 'Some description' + assert config.info.path is None + assert isinstance(config.info.created, datetime.datetime) + assert isinstance(config.info.last_modified, datetime.datetime) + assert config.categories == [config.info, config.rendering] + assert config.parameters == config.info.parameters + config.rendering.parameters + + +def test_project_config_as_cif_has_project_and_rendering_sections_without_data_header(): + from easydiffraction.project.project_config import ProjectConfig + + config = ProjectConfig(name='beer', title='Beer title', description='Some description') + + cif_text = config.as_cif + + assert not cif_text.startswith('data_') + assert '_project.id beer' in cif_text + assert '_project.title' in cif_text + assert '_project.description' in cif_text + assert '_project.created' in cif_text + assert '_project.last_modified' in cif_text + assert '_rendering.chart_engine' in cif_text + assert '_rendering.table_engine' in cif_text + + +def test_project_save_and_load_keep_project_config_section_format(tmp_path): + from easydiffraction.project.project import Project + + project = Project(name='beer', title='Beer title', description='Some description') + project.rendering.chart_engine = 'asciichartpy' + project.rendering.table_engine = 'rich' + project.save_as(str(tmp_path / 'proj')) + + project_cif = (tmp_path / 'proj' / 'project.cif').read_text() + assert not project_cif.startswith('data_') + assert '_project.id beer' in project_cif + assert '_rendering.chart_engine asciichartpy' in project_cif + assert '_rendering.table_engine rich' in project_cif + + loaded = Project.load(str(tmp_path / 'proj')) + assert loaded.info.name == 'beer' + assert loaded.info.title == 'Beer title' + assert loaded.info.description == 'Some description' + assert isinstance(loaded.info.created, datetime.datetime) + assert isinstance(loaded.info.last_modified, datetime.datetime) + assert loaded.rendering.chart_engine.value == 'asciichartpy' + assert loaded.rendering.table_engine.value == 'rich' From 76f53d5fac6eae82e161e8ea7fcddbd77f7765f2 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 14:35:47 +0200 Subject: [PATCH 12/15] Standardize project.save_as usage in tutorials --- docs/docs/tutorials/ed-17.py | 2 +- docs/docs/tutorials/ed-20.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/docs/tutorials/ed-17.py b/docs/docs/tutorials/ed-17.py index 2fd4b0f6..24e648c5 100644 --- a/docs/docs/tutorials/ed-17.py +++ b/docs/docs/tutorials/ed-17.py @@ -26,7 +26,7 @@ # results can be written to `analysis/results.csv`. # %% -project.save_as('projects/cosio', temporary=False) +project.save_as(dir_path='projects/cosio', temporary=False) # %% [markdown] # ## Step 2: Define Crystal Structure diff --git a/docs/docs/tutorials/ed-20.py b/docs/docs/tutorials/ed-20.py index 8c6a5b54..1fcc6d24 100644 --- a/docs/docs/tutorials/ed-20.py +++ b/docs/docs/tutorials/ed-20.py @@ -225,7 +225,7 @@ # %% project = Project(name='beer') -project.save_as(dir_path='beer_mcstas') +project.save_as(dir_path='projects/beer_mcstas') # %% [markdown] # #### Add Structures From 365c90f9700fc6e8cedffeca519e5ea6f2e10be0 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 14:35:57 +0200 Subject: [PATCH 13/15] Ignore projects directory --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 28df1a0a..ec2e0d27 100644 --- a/.gitignore +++ b/.gitignore @@ -49,4 +49,5 @@ CMakeLists.txt.user* # Used to fetch tutorials data during their runtime. Need to have '/' at # the beginning to avoid ignoring 'data' module in the src/. /data/ +/projects/ /tmp/ From f8adc7c4dd2afdb353ee49fe55017962cef66ba6 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 14:44:24 +0200 Subject: [PATCH 14/15] Mark verification steps as complete --- docs/dev/plan_category-owner-sections.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md index 9a62c63e..8e6c5014 100644 --- a/docs/dev/plan_category-owner-sections.md +++ b/docs/dev/plan_category-owner-sections.md @@ -30,12 +30,12 @@ Phase 1 — Implementation [x] Phase 1 review gate: present diff for approval. Phase 2 — Verification -[ ] Add per-phase unit tests (see "Tests For Phase X" sections). -[ ] pixi run fix -[ ] pixi run check -[ ] pixi run unit-tests -[ ] pixi run integration-tests -[ ] pixi run script-tests +[x] Add per-phase unit tests (see "Tests For Phase X" sections). +[x] pixi run fix +[x] pixi run check +[x] pixi run unit-tests +[x] pixi run integration-tests +[x] pixi run script-tests ``` ## Commit Discipline From df93079491d72cdbb9061aa67d53a8a867458875 Mon Sep 17 00:00:00 2001 From: Andrew Sazonov Date: Sun, 17 May 2026 14:52:20 +0200 Subject: [PATCH 15/15] Accept CategoryOwner ADR and remove migration plan --- .../adr_category-owner-sections.md | 298 ------ docs/dev/ADRs/adr_category-owner-sections.md | 206 ++++ docs/dev/plan_category-owner-sections.md | 965 ------------------ 3 files changed, 206 insertions(+), 1263 deletions(-) delete mode 100644 docs/dev/ADR-suggestions/adr_category-owner-sections.md create mode 100644 docs/dev/ADRs/adr_category-owner-sections.md delete mode 100644 docs/dev/plan_category-owner-sections.md diff --git a/docs/dev/ADR-suggestions/adr_category-owner-sections.md b/docs/dev/ADR-suggestions/adr_category-owner-sections.md deleted file mode 100644 index bf13940d..00000000 --- a/docs/dev/ADR-suggestions/adr_category-owner-sections.md +++ /dev/null @@ -1,298 +0,0 @@ -# ADR: Category Owners and Real Datablocks - -**Status:** Proposed -**Date:** 2026-05-17 - -## Context - -The current architecture has two real datablock families: - -- structures -- experiments - -These are real datablocks because each instance maps to a CIF -`data_` block and has a datablock entry name. This is reflected in -`DatablockItem`, `DatablockCollection`, and `datablock_item_to_cif()`. - -`Analysis` and project-level configuration are different. They own -CIF-like categories, but they are singleton project sections rather than -collections of independently named data blocks. They should not need a -datablock ID, and they should not serialize as `data_analysis` or -`data_project`. - -The problem is that `DatablockItem` currently combines two -responsibilities: - -1. It owns and updates flat categories. -2. It represents a real CIF data block with a `data_` header. - -This makes it tempting to make `Analysis` inherit from `DatablockItem` -only to reuse category discovery, parameter enumeration, update -orchestration, and CIF category serialization. That would improve code -reuse, but it would weaken the meaning of "datablock" in the model. - -The open issue "Make `Analysis` a `DatablockItem`" already identifies -the design fork: either make `Analysis` inherit from `DatablockItem`, or -extract a shared category-update protocol. - -A detailed migration plan exists in: - -```text -docs/dev/plan_category-owner-sections.md -``` - -This ADR records the intended architectural decision behind that plan. - -## Decision - -Introduce a new core abstraction named `CategoryOwner`. - -`CategoryOwner` represents an object that owns flat CIF-like categories. -It provides shared behavior for: - -- category discovery -- category sorting by `_update_priority` -- parameter aggregation -- category update orchestration -- dirty-flag tracking -- category help display -- category-body CIF serialization without a `data_` header - -`DatablockItem` will inherit from `CategoryOwner` and add only the -behavior specific to real CIF data blocks: - -- `unique_name` from `datablock_entry_name` -- `data_` header serialization -- participation in `DatablockCollection` - -The target hierarchy is: - -```text -GuardedBase -|-- CategoryItem -|-- CollectionBase -| |-- CategoryCollection -| `-- DatablockCollection -`-- CategoryOwner - |-- DatablockItem - | |-- Structure - | `-- ExperimentBase - |-- Analysis - `-- ProjectConfig # optional follow-up -``` - -## Detailed Rules - -### 1. A real datablock must emit a CIF `data_` header - -Only objects that serialize as independent CIF data blocks should -inherit datablock-specific behavior. - -Current real datablocks: - -- `Structure` -- `ExperimentBase` subclasses - -These should continue to emit: - -```text -data_ -data_ -``` - -### 2. `Analysis` is a category-owning singleton section - -`Analysis` should inherit from `CategoryOwner`, not `DatablockItem`. - -It owns categories such as: - -- `fitting` -- `aliases` -- `constraints` -- `joint_fit` -- `sequential_fit` -- `sequential_fit_extract` - -It should reuse shared category discovery and parameter aggregation, but -its saved CIF should remain a section body. It must not emit -`data_analysis`. - -### 3. Project configuration may become a category-owning singleton section - -Project-level configuration should eventually follow the same model. - -The likely long-term shape is: - -```text -Project -`-- ProjectConfig(CategoryOwner) - |-- ProjectInfo or ProjectMetadata - `-- Rendering -``` - -This should be treated as a follow-up after the `Analysis` migration is -stable, because project save/load compatibility is a separate risk. - -### 4. CIF serialization should be split by responsibility - -Add a serializer for category-owner bodies: - -```python -category_owner_to_cif(owner) -``` - -This function serializes categories but does not add a `data_` header. - -Keep `datablock_item_to_cif(datablock)` for real datablocks. It should -compose: - -1. `data_` -2. `category_owner_to_cif(datablock)` - -### 5. Active analysis categories remain an owner-level policy - -`Analysis` has mode-specific sibling categories. They remain direct -children of `Analysis`, not nested under `fitting`. - -`Analysis` should expose a hook such as: - -```python -def _serializable_categories(self) -> list: - ... -``` - -This hook controls which categories are written for the current fitting -mode: - -- shared categories always serialize -- `joint_fit` serializes in joint mode -- `sequential_fit` and `sequential_fit_extract` serialize in sequential - mode -- inactive mode-specific categories remain accessible but are not - serialized - -This preserves the active-sibling selector model accepted in -`docs/dev/ADRs/adr_fit-mode-categories.md`. - -## Consequences - -### Positive - -- The term "datablock" remains precise. -- `Analysis` can reuse standard category discovery and parameter - enumeration without becoming a fake data block. -- CIF serialization becomes clearer because category-body rendering is - separated from `data_` header rendering. -- Future project-level category sections can reuse the same base class. -- Dirty-flag behavior can be generalized from `DatablockItem` to - `CategoryOwner`. - -### Trade-offs - -- A new core abstraction must be introduced and documented. -- `DatablockItem` must be refactored without changing structure and - experiment behavior. -- `Analysis` migration touches both runtime behavior and CIF - serialization, so it needs characterization tests. -- Project configuration cleanup should be delayed to avoid mixing two - save/load migrations. - -### Compatibility Requirements - -The migration must preserve: - -- `structure.as_cif` starts with `data_` -- `experiment.as_cif` starts with `data_` -- `analysis.as_cif` does not start with `data_` -- project save/load layout remains compatible -- `project.parameters` remains fit-focused and does not include analysis - configuration parameters unless a later ADR changes that contract - -## Alternatives Considered - -### Make `Analysis` inherit from `DatablockItem` - -Rejected. - -This would be the smallest code change, but it would make "datablock" -mean both real CIF data blocks and singleton project sections. It would -also encourage fake identities such as -`datablock_entry_name = "analysis"`. - -### Add `emit_data_header = False` to `DatablockItem` - -Rejected. - -This keeps inheritance reuse but encodes two different concepts in one -class. It also makes every future datablock-related method check whether -the object is a real datablock. - -### Keep `Analysis` fully ad hoc - -Rejected. - -This preserves current behavior but leaves duplicated logic for category -discovery, category updates, parameter enumeration, and CIF section -serialization. - -### Make `Project` itself a `CategoryOwner` - -Deferred. - -`Project` is a facade that owns collections, display helpers, summary -helpers, analysis, and file I/O. Making the facade itself a category -owner would mix orchestration with category-section behavior. A smaller -`ProjectConfig(CategoryOwner)` is a better follow-up. - -## Implementation Plan - -Follow: - -```text -docs/dev/plan_category-owner-sections.md -``` - -The plan should be implemented in small phases: - -1. Add baseline tests. -2. Add `CategoryOwner`. -3. Make `DatablockItem` inherit `CategoryOwner`. -4. Split category-owner CIF body serialization from datablock - serialization. -5. Move `Analysis` onto `CategoryOwner`. -6. Generalize dirty-flag lookup from `DatablockItem` to `CategoryOwner`. -7. Optionally introduce `ProjectConfig`. -8. Update architecture and issue tracking. - -## Post-Implementation ADR Update - -This ADR must be updated after the migration plan is implemented. - -When implementation is complete: - -1. Change status from `Proposed` to `Accepted and implemented`. -2. Update the date if the project convention requires the implementation - date. -3. Replace tentative wording such as "should" and "target hierarchy" - with the actual final design. -4. Record any deviations from the migration plan. -5. Link to the implementation PR or commit if available. -6. Move this file from `docs/dev/ADR-suggestions/` to `docs/dev/ADRs/` - if that is the repository convention for accepted decisions. -7. Update `docs/dev/architecture.md`. -8. Update or close the related issue in `docs/dev/Issues/issues_open.md` - (move to `docs/dev/Issues/issues_closed.md` on full resolution). - -## Acceptance Criteria - -This ADR is satisfied when: - -- `CategoryOwner` exists and is documented. -- `DatablockItem` inherits from `CategoryOwner`. -- `Analysis` inherits from `CategoryOwner`, not `DatablockItem`. -- `Analysis` uses shared category discovery and parameter enumeration. -- real datablocks still emit `data_` headers. -- singleton sections do not emit fake `data_` headers. -- dirty-flag propagation works for all category owners. -- architecture documentation distinguishes real datablocks from - category-owning sections. diff --git a/docs/dev/ADRs/adr_category-owner-sections.md b/docs/dev/ADRs/adr_category-owner-sections.md new file mode 100644 index 00000000..46f9b3db --- /dev/null +++ b/docs/dev/ADRs/adr_category-owner-sections.md @@ -0,0 +1,206 @@ +# ADR: Category Owners and Real Datablocks + +## Status + +Accepted and implemented. + +## Date + +2026-05-17 + +## Context + +The library has two different kinds of objects that expose CIF-like +categories: + +- real datablocks such as structures and experiments +- singleton project sections such as analysis and project configuration + +Real datablocks map to CIF `data_` blocks and therefore need a +datablock identity plus a `data_` header. Singleton sections do not. + +Before this change, `DatablockItem` mixed two responsibilities: + +1. owning and updating flat categories +2. representing a real CIF data block with a `data_` header + +That made it tempting to move `Analysis` onto `DatablockItem` just to +reuse category discovery, parameter enumeration, dirty tracking, and CIF +serialization. Doing so would have weakened the meaning of "datablock" +in the architecture and encouraged fake identities such as +`datablock_entry_name = "analysis"`. + +## Decision + +Introduce `CategoryOwner` as the shared abstraction for objects that own +flat CIF-like categories. + +### 1. Real datablocks remain `DatablockItem`-based + +`DatablockItem` now inherits from `CategoryOwner` and keeps only the +behavior specific to real CIF data blocks: + +- `data_` header serialization +- datablock identity via `datablock_entry_name` +- participation in `DatablockCollection` + +The real datablock families remain: + +- `Structure` +- `ExperimentBase` subclasses + +They continue to serialize as independent CIF data blocks with +`data_` headers. + +### 2. `Analysis` is a category-owning singleton section + +`Analysis` inherits from `CategoryOwner`, not `DatablockItem`. + +It reuses shared behavior for: + +- category discovery +- parameter aggregation +- category update orchestration +- dirty-flag tracking +- help display +- headerless CIF body serialization + +`Analysis` remains a singleton section without a fake `data_` header. It +uses an owner-level `_serializable_categories()` policy so that only the +active sibling categories are written for the current fitting mode. +Inactive mode-specific categories remain accessible but are not +serialized. + +### 3. Project configuration is a category-owning singleton section + +Project-level configuration follows the same pattern via a private +`ProjectConfig(CategoryOwner)` object. + +Its current children are: + +- `ProjectInfo` +- `Rendering` + +The public API stays flat and user-facing: + +- `project.info` +- `project.rendering` + +Saved `project.cif` remains a section file without a `data_` header. It +serializes the `_project.*` metadata category and the `_rendering.*` +configuration category without pretending that the project config is a +real datablock. + +### 4. CIF serialization is split by responsibility + +Serialization is separated into two layers: + +- `category_owner_to_cif(owner)` renders category bodies without a + `data_` header +- `datablock_item_to_cif(datablock)` renders the `data_` header and + then the category-owner body + +This keeps the meaning of `DatablockItem` precise while letting +singleton sections reuse the same category serialization logic. + +### 5. Dirty-flag propagation is generalized to `CategoryOwner` + +Descriptor changes now mark the nearest `CategoryOwner` ancestor dirty +instead of depending on `DatablockItem` specifically. Structures, +experiments, analysis, and project configuration now share the same +owner-level dirty/update contract. + +## Resulting Hierarchy + +```text +GuardedBase +|-- CategoryItem +|-- CollectionBase +| |-- CategoryCollection +| `-- DatablockCollection +`-- CategoryOwner + |-- DatablockItem + | |-- Structure + | `-- ExperimentBase + |-- Analysis + `-- ProjectConfig +``` + +## Consequences + +### Positive + +- The term "datablock" remains semantically precise. +- `Analysis` and project configuration reuse the standard category-owner + behavior without becoming fake data blocks. +- CIF serialization is clearer because category-body rendering is + separated from `data_` header rendering. +- Dirty-flag handling is consistent across all category owners. +- Project-level singleton sections now follow the same architectural + pattern as analysis. + +### Trade-offs + +- The core model gains a new abstraction that must be understood and + documented. +- Owner-level serialization policy now lives in explicit hooks such as + `_serializable_categories()` instead of falling out of the raw object + layout. + +### Compatibility Outcomes + +The implemented design preserves these contracts: + +- `structure.as_cif` starts with `data_` +- `experiment.as_cif` starts with `data_` +- `analysis.as_cif` does not start with `data_` +- `project.cif` does not emit `data_project` +- `project.parameters` remains fit-focused and does not include analysis + configuration parameters +- saved project layout remains compatible + +## Alternatives Considered + +### Make `Analysis` inherit from `DatablockItem` + +Rejected. + +This would have been the smallest code change, but it would make +"datablock" mean both real CIF data blocks and singleton project +sections. + +### Add `emit_data_header = False` to `DatablockItem` + +Rejected. + +This would keep reuse through inheritance but encode two different +concepts in one class and force datablock behavior to branch on whether +the object is "real enough" to emit a header. + +### Keep `Analysis` fully ad hoc + +Rejected. + +That would preserve current behavior but keep duplicated logic for +category discovery, category updates, parameter enumeration, and CIF +section serialization. + +### Make `Project` itself a `CategoryOwner` + +Rejected. + +`Project` is a top-level facade that coordinates structures, +experiments, analysis, display, summary, and file I/O. A smaller private +`ProjectConfig(CategoryOwner)` keeps the category-owning concern local +to the singleton project-section surface instead of mixing it into the +facade itself. + +## Verification + +This decision is fully implemented and was verified with: + +- `pixi run fix` +- `pixi run check` +- `pixi run unit-tests` +- `pixi run integration-tests` +- `pixi run script-tests` diff --git a/docs/dev/plan_category-owner-sections.md b/docs/dev/plan_category-owner-sections.md deleted file mode 100644 index 8e6c5014..00000000 --- a/docs/dev/plan_category-owner-sections.md +++ /dev/null @@ -1,965 +0,0 @@ -# Category Owner Migration Plan - -## Status - -Branch: `feature/category-owner-sections` - -Two-phase workflow (see `.github/copilot-instructions.md`): - -- Phase 1 — Implementation. Code, docs, and architecture updates only. - Phase 0 baseline characterization tests below are an explicit - exception (they describe pre-existing behavior so the refactor can be - verified mechanically). Do not add new feature tests during Phase 1. -- Phase 2 — Verification. Add the per-phase tests listed below, then run - the verification commands in the "Phase 2 Verification" section. - -Stop after Phase 1 and request review before starting Phase 2. - -Status checklist (mark `[x]` as completed): - -```text -Phase 1 — Implementation -[x] Phase 0: Add baseline characterization tests. -[x] Phase 1: Add CategoryOwner in core/category_owner.py. -[x] Phase 2: Make DatablockItem inherit CategoryOwner. -[x] Phase 3: Split CIF body serialization (category_owner_to_cif). -[x] Phase 4: Move Analysis onto CategoryOwner. -[x] Phase 5: Update dirty-flag lookup to CategoryOwner. -[x] Phase 6: (Optional) ProjectConfig cleanup. -[x] Phase 7: Update architecture.md and Issues/issues_open.md. -[x] Phase 1 review gate: present diff for approval. - -Phase 2 — Verification -[x] Add per-phase unit tests (see "Tests For Phase X" sections). -[x] pixi run fix -[x] pixi run check -[x] pixi run unit-tests -[x] pixi run integration-tests -[x] pixi run script-tests -``` - -## Commit Discipline - -When an AI agent follows this plan, every completed Phase 1 -implementation step listed in the status checklist must be staged with -explicit paths and committed locally before moving to the next -implementation step or to the Phase 1 review gate. Follow the rules in -the **Commits** section of `.github/copilot-instructions.md`. - -- One commit per step. Atomic and single-purpose. -- Stage explicit paths only — do not `git add .`. -- Suggested commit messages (≤72 chars, imperative mood, no type - prefix): - - `Add baseline category-owner characterization tests` - - `Add CategoryOwner base class` - - `Make DatablockItem inherit CategoryOwner` - - `Add category_owner_to_cif and reuse it in datablock serializer` - - `Make Analysis inherit CategoryOwner` - - `Generalize dirty-flag lookup to CategoryOwner` - - `Introduce ProjectConfig category owner` (optional) - - `Document CategoryOwner in architecture.md` -- Do not stage generated artifacts produced by integration/script/ - notebook tests unless explicitly asked. - -## Purpose - -This plan explains how to make `Analysis` and project-level -configuration reuse the same category-management behavior as real -datablocks without making them real datablocks. - -The current code has two real datablock types: - -- `Structure` -- `Experiment` - -Those are real datablocks because they represent CIF `data_` blocks -and have a user-defined block ID. - -`Analysis` and project-level configuration are different. They own -CIF-like categories, but they are singleton project sections. They -should not require a datablock ID and should not serialize as -`data_analysis` or `data_project`. - -The target design is: - -```text -GuardedBase -|-- CategoryItem -|-- CollectionBase -| |-- CategoryCollection -| `-- DatablockCollection -`-- CategoryOwner - |-- DatablockItem - | |-- Structure - | `-- ExperimentBase - |-- Analysis - `-- ProjectConfig # optional follow-up -``` - -`CategoryOwner` contains shared behavior for objects that own flat -categories. `DatablockItem` keeps the additional behavior that only real -CIF data blocks need. - -## Important Definitions - -Use these terms consistently during the migration. - -`DatablockItem` : A real CIF data block. It serializes with a -`data_` header. It has a `datablock_entry_name`. Examples: -`Structure`, `ExperimentBase`. - -`CategoryOwner` : An object that owns flat sibling categories and can -update, enumerate, and serialize those categories. It does not -necessarily have a `data_` header. Examples after migration: -`DatablockItem`, `Analysis`, optional `ProjectConfig`. - -`CategoryItem` : A single category row. Example: `Cell`, `SpaceGroup`, -`Fitting`, `Rendering`. - -`CategoryCollection` : A loop-style category collection. Example: -`AtomSites`, `Aliases`, `JointFitCollection`. - -## Non-Negotiable Rules - -Follow these rules throughout the migration: - -1. Do not make `Analysis` inherit directly from `DatablockItem`. -2. Do not make `Project` inherit from `DatablockItem`. -3. Do not emit `data_analysis` from `Analysis.as_cif`. -4. Do not emit `data_project` in the saved `project.cif` file. -5. Do not rename CIF tags during this migration. -6. Do not change public access paths unless a phase explicitly says so. -7. Keep `project.parameters` limited to fit-relevant structure and - experiment parameters unless a separate design decision changes that - later. -8. Keep inactive analysis categories accessible unless the fit-mode - policy is intentionally changed in a separate task. - -## Recommended Branching Strategy - -Use several small pull requests or commits. Each phase should be -independently reviewable. - -Suggested split: - -1. Characterization tests. -2. Add `CategoryOwner`. -3. Move `DatablockItem` onto `CategoryOwner`. -4. Split CIF body serialization. -5. Move `Analysis` onto `CategoryOwner`. -6. Update dirty-flag lookup. -7. Optional project-config cleanup. -8. Documentation cleanup. - -Do not combine phases 4, 5, and 6 into one large change. If something -breaks, small phases make the source of the break obvious. - -## Phase 0: Baseline Safety Tests - -Before changing production code, add or confirm tests that describe -current behavior. - -### Files To Read First - -Read these files before editing: - -- `src/easydiffraction/core/datablock.py` -- `src/easydiffraction/core/category.py` -- `src/easydiffraction/core/collection.py` -- `src/easydiffraction/core/identity.py` -- `src/easydiffraction/core/variable.py` -- `src/easydiffraction/analysis/analysis.py` -- `src/easydiffraction/project/project.py` -- `src/easydiffraction/io/cif/serialize.py` -- `docs/dev/architecture.md` -- `docs/dev/Issues/issues_open.md` - -### Tests To Add Or Confirm - -Add focused tests before refactoring. Good locations are: - -- `tests/unit/easydiffraction/core/` -- `tests/unit/easydiffraction/analysis/` -- `tests/unit/easydiffraction/io/cif/` - -Test these behaviors: - -1. A structure CIF starts with `data_`. -2. An experiment CIF starts with `data_`. -3. `Analysis.as_cif` does not start with `data_`. -4. `Analysis.as_cif` includes `_fitting.mode_type`. -5. `Analysis.as_cif` includes `fitting`, `aliases`, and `constraints` - sections when present. -6. In joint mode, `Analysis.as_cif` includes `joint_fit`. -7. In sequential mode, `Analysis.as_cif` includes `sequential_fit` and - `sequential_fit_extract`. -8. In single mode, inactive mode-specific sections are not serialized. -9. Existing project save/load behavior still reads: - - `project.cif` - - `structures/*.cif` - - `experiments/*.cif` - - `analysis/analysis.cif` - -### Suggested Commands - -Run focused tests first: - -```shell -pixi run unit-tests tests/unit/easydiffraction/core -pixi run unit-tests tests/unit/easydiffraction/analysis -pixi run unit-tests tests/unit/easydiffraction/io/cif -``` - -If these pass, run a broader unit subset: - -```shell -pixi run unit-tests tests/unit/easydiffraction/datablocks -``` - -## Phase 1: Add `CategoryOwner` - -Create: - -```text -src/easydiffraction/core/category_owner.py -``` - -### Responsibility - -`CategoryOwner` should own the behavior currently shared by any object -that has flat categories: - -- category discovery -- category sorting by `_update_priority` -- parameter aggregation -- category update orchestration -- dirty flag -- category table in `help()` -- CIF body serialization support, without a `data_` header - -### Initial Implementation Shape - -Start with this shape. Adjust imports as needed. - -```python -from __future__ import annotations - -from easydiffraction.core.category import CategoryCollection -from easydiffraction.core.category import CategoryItem -from easydiffraction.core.guard import GuardedBase - - -class CategoryOwner(GuardedBase): - """Base class for objects that own flat CIF-like categories.""" - - def __init__(self) -> None: - super().__init__() - self._need_categories_update = True - - @property - def categories(self) -> list: - """All category objects owned by this object, sorted by priority.""" - cats = [ - v - for v in vars(self).values() - if isinstance(v, (CategoryItem, CategoryCollection)) - ] - return sorted(cats, key=lambda c: type(c)._update_priority) - - def _serializable_categories(self) -> list: - """Categories that should be serialized for this owner.""" - return self.categories - - @property - def parameters(self) -> list: - """All parameters from all owned categories.""" - params = [] - for category in self.categories: - params.extend(category.parameters) - return params - - def _update_categories( - self, - *, - called_by_minimizer: bool = False, - ) -> None: - """Run update hooks on all owned categories.""" - if not called_by_minimizer and not self._need_categories_update: - return - - for category in self.categories: - category._update(called_by_minimizer=called_by_minimizer) - - self._need_categories_update = False -``` - -### Help Method - -Move the category-summary part of `DatablockItem.help()` into -`CategoryOwner.help()`. - -Keep it simple: - -1. Call `super().help()`. -2. Render a table of category code, type name, and parameter count. - -### Tests For Phase 1 - -Create fake category owner tests. Do not use `Structure` or `Analysis` -yet. - -Test: - -1. `CategoryOwner.categories` finds direct `CategoryItem` attributes. -2. `CategoryOwner.categories` finds direct `CategoryCollection` - attributes. -3. Categories are sorted by `_update_priority`. -4. `CategoryOwner.parameters` aggregates parameters from categories. -5. `_update_categories()` calls category `_update()` methods. -6. `_update_categories()` honors `_need_categories_update`. -7. `_serializable_categories()` returns `categories` by default. - -## Phase 2: Make `DatablockItem` Inherit `CategoryOwner` - -Edit: - -```text -src/easydiffraction/core/datablock.py -``` - -### What To Change - -Change: - -```python -class DatablockItem(GuardedBase): -``` - -to: - -```python -class DatablockItem(CategoryOwner): -``` - -Import `CategoryOwner` from the new module. - -Remove from `DatablockItem` any code now provided by `CategoryOwner`: - -- `_need_categories_update` initialization -- generic `_update_categories()` -- `categories` -- `parameters` -- category-summary part of `help()`, if moved - -Keep these in `DatablockItem`: - -- `unique_name` -- `as_cif` -- `_cif_for_display` -- datablock-specific string representations - -### Expected Result - -This phase should not change public behavior. - -These should still work: - -```python -structure.as_cif -experiment.as_cif -structure.categories -experiment.categories -structure.parameters -experiment.parameters -``` - -### Tests For Phase 2 - -Run: - -```shell -pixi run unit-tests tests/unit/easydiffraction/core -pixi run unit-tests tests/unit/easydiffraction/datablocks -pixi run unit-tests tests/unit/easydiffraction/io/cif -``` - -If failures happen, check parent linkage first. Most failures in this -phase are likely caused by categories not having the expected `_parent`. - -## Phase 3: Split CIF Body Serialization From Datablock Serialization - -Edit: - -```text -src/easydiffraction/io/cif/serialize.py -``` - -### Problem - -`datablock_item_to_cif()` currently does two jobs: - -1. Adds the `data_` header. -2. Serializes category content. - -Only real datablocks need job 1. `Analysis` and project config need -job 2. - -### Add `category_owner_to_cif()` - -Add a helper like this: - -```python -def category_owner_to_cif( - owner: object, - max_loop_display: int | None = None, -) -> str: - """Render a CategoryOwner-like object's categories without a data_ header.""" - from easydiffraction.core.category import CategoryCollection - from easydiffraction.core.category import CategoryItem - - categories_getter = getattr(owner, "_serializable_categories", None) - if callable(categories_getter): - categories = categories_getter() - else: - categories = [ - v - for v in vars(owner).values() - if isinstance(v, (CategoryItem, CategoryCollection)) - ] - - item_parts = [ - category.as_cif - for category in categories - if isinstance(category, CategoryItem) and category.as_cif - ] - - collection_parts = [ - category_collection_to_cif(category, max_display=max_loop_display) - for category in categories - if isinstance(category, CategoryCollection) - ] - - return "\n\n".join([part for part in item_parts + collection_parts if part]) -``` - -Keep the current item-first, collection-second ordering unless a -separate test and design decision changes it. - -### Update `datablock_item_to_cif()` - -After adding `category_owner_to_cif()`, reduce `datablock_item_to_cif()` -to: - -```python -header = f"data_{datablock._identity.datablock_entry_name}" -body = category_owner_to_cif(datablock, max_loop_display=max_loop_display) -if not body: - return header -return "\n\n".join([header, body]) -``` - -### Tests For Phase 3 - -Add or update tests for: - -1. `category_owner_to_cif()` does not emit `data_`. -2. `datablock_item_to_cif()` still emits `data_`. -3. Empty category owners serialize to an empty string. -4. Empty datablocks serialize to only the header. -5. Loop truncation still works for display. - -Run: - -```shell -pixi run unit-tests tests/unit/easydiffraction/io/cif -pixi run unit-tests tests/unit/easydiffraction/core -pixi run unit-tests tests/unit/easydiffraction/datablocks -``` - -## Phase 4: Move `Analysis` Onto `CategoryOwner` - -Edit: - -```text -src/easydiffraction/analysis/analysis.py -``` - -### Main Change - -Change: - -```python -class Analysis: -``` - -to: - -```python -class Analysis(CategoryOwner): -``` - -Call `super().__init__()` at the start of `Analysis.__init__()`. - -### Parent Linkage - -Because `Analysis` will now be a `GuardedBase` subclass, private -assignment of `GuardedBase` children should normally set `_parent` -automatically. - -Still verify these categories have `_parent is analysis`: - -- `analysis.aliases` -- `analysis.constraints` -- `analysis.fitting` -- `analysis.joint_fit` -- `analysis.sequential_fit` -- `analysis.sequential_fit_extract` - -Some are currently assigned manually and some may not be. Make the -result consistent. - -### Keep Existing Public API - -Do not remove these access paths: - -```python -project.analysis.fitting -project.analysis.aliases -project.analysis.constraints -project.analysis.joint_fit -project.analysis.sequential_fit -project.analysis.sequential_fit_extract -project.analysis.fitting_mode_type -``` - -If direct public attributes currently exist without properties, convert -them carefully. Less experienced agents should prefer a -compatibility-preserving change: - -1. Keep the public access path. -2. Add properties only when needed. -3. Avoid large style cleanup in the same phase. - -### Add `_serializable_categories()` - -Add this method to `Analysis`: - -```python -def _serializable_categories(self) -> list: - """Analysis categories that should be written for the active fit mode.""" - categories = [ - self.fitting, - self.aliases, - self.constraints, - ] - - if self._fitting_mode_type is FitModeEnum.JOINT: - categories.append(self.joint_fit) - elif self._fitting_mode_type is FitModeEnum.SEQUENTIAL: - categories.extend([ - self.sequential_fit, - self.sequential_fit_extract, - ]) - - return categories -``` - -This preserves the current rule: - -- shared analysis categories always serialize -- joint-only category serializes only in joint mode -- sequential categories serialize only in sequential mode -- inactive categories remain accessible but are not written - -### Update `Analysis.as_cif` - -Keep the leading mode line: - -```text -_fitting.mode_type -``` - -Then append `category_owner_to_cif(self)`. - -Example implementation shape: - -```python -@property -def as_cif(self) -> str: - self._update_categories() - return analysis_to_cif(self) -``` - -Then update `analysis_to_cif()` in `serialize.py` to use -`category_owner_to_cif(analysis)` internally. - -Recommended serializer shape: - -```python -def analysis_to_cif(analysis: object) -> str: - parts = [ - f"_fitting.mode_type {format_value(analysis.fitting_mode_type)}", - ] - body = category_owner_to_cif(analysis) - if body: - parts.append(body) - return "\n\n".join(parts) -``` - -### Keep Or Adjust `_update_categories()` - -`Analysis._update_categories()` currently applies constraints. That is -analysis-specific and must not be lost. - -Safest first version: - -```python -def _update_categories( - self, - *, - called_by_minimizer: bool = False, -) -> None: - super()._update_categories(called_by_minimizer=called_by_minimizer) - - if self.constraints.enabled and self.constraints._items: - self.constraints_handler.set_aliases(self.aliases) - self.constraints_handler.set_constraints(self.constraints) - self.constraints_handler.apply() -``` - -If this changes behavior, use the existing constraint logic and leave a -short comment explaining why shared category updates are intentionally -skipped or ordered differently. - -### Tests For Phase 4 - -Add tests for: - -1. `isinstance(analysis, CategoryOwner)`. -2. `analysis.categories` includes analysis categories. -3. `analysis.parameters` includes fitting and analysis config - descriptors. -4. Parent links are set on all analysis categories. -5. `analysis.as_cif` still does not start with `data_`. -6. `analysis.as_cif` still includes `_fitting.mode_type`. -7. Mode-specific serialization behavior is unchanged. -8. `analysis.help()` still works and respects `_help_filter()`. - -Run: - -```shell -pixi run unit-tests tests/unit/easydiffraction/analysis -pixi run unit-tests tests/unit/easydiffraction/io/cif -pixi run unit-tests tests/unit/easydiffraction/project -``` - -## Phase 5: Update Dirty-Flag Lookup - -Edit: - -```text -src/easydiffraction/core/variable.py -``` - -### Current Behavior - -Descriptors currently find a `DatablockItem` ancestor when marking an -owner dirty after value changes. - -That is too narrow after this migration. `Analysis` will also be a -category owner. - -### Target Behavior - -Descriptors should mark the nearest `CategoryOwner` ancestor dirty. - -Change helper naming carefully. For example: - -```python -def _category_owner(self) -> object | None: - from easydiffraction.core.category_owner import CategoryOwner - - return self._parent_of_type(CategoryOwner) -``` - -Then update value setters to use this owner: - -```python -parent_owner = self._category_owner() -if parent_owner is not None: - parent_owner._need_categories_update = True -``` - -### Compatibility Check - -This must still mark structures and experiments dirty because -`DatablockItem` now inherits `CategoryOwner`. - -### Tests For Phase 5 - -Add tests for: - -1. Changing a structure parameter marks the structure dirty. -2. Changing an experiment parameter marks the experiment dirty. -3. Changing an analysis category descriptor marks analysis dirty. -4. `_set_value_from_minimizer()` still marks the owner dirty. - -Run: - -```shell -pixi run unit-tests tests/unit/easydiffraction/core -pixi run unit-tests tests/unit/easydiffraction/analysis -pixi run unit-tests tests/unit/easydiffraction/datablocks -``` - -## Phase 6: Optional Project Config Cleanup - -This phase is optional and should be a separate PR. Do not do it until -`Analysis` is stable. - -### Goal - -Project-level configuration also has category-like behavior. Today: - -- `ProjectInfo` is a special metadata object. -- `Rendering` is already a `CategoryItem`. -- `project_config_to_cif()` manually combines them. - -The cleaner long-term shape is: - -```text -Project -`-- ProjectConfig(CategoryOwner) - |-- ProjectInfo - `-- Rendering -``` - -This full version is now implemented. `ProjectInfo` is a real -`CategoryItem` under `project/categories/info/`, `ProjectConfig` -serializes via shared `CategoryOwner` helpers, and the public -`project.info` / `project.rendering` access paths remain unchanged. - -### Low-Risk Version - -If converting `ProjectInfo` into a `CategoryItem` is too disruptive, do -not do it immediately. - -Instead: - -1. Add `ProjectConfig(CategoryOwner)`. -2. Put only `rendering` inside it at first. -3. Keep `ProjectInfo` special. -4. Keep `project.info` and `project.rendering` public access unchanged. -5. Use shared category-owner serialization only for `rendering`. - -### Full Version - -Convert project metadata into a real category item: - -```text -ProjectMetadata(CategoryItem) -``` - -It should own descriptors for: - -- project id -- title -- description -- created timestamp -- last modified timestamp - -This is more work because timestamps and path handling need care. The -implemented version keeps `path` as runtime-only state while storing the -project id, title, description, created timestamp, and last-modified -timestamp in the `ProjectInfo` category. - -### Save/Load Rule - -Even after this cleanup, saved `project.cif` should remain a section -file without an explicit `data_` header. The loader currently wraps -project config with `data_project` before parsing, and that can remain -an implementation detail. - -## Phase 7: Documentation Updates - -Update: - -```text -docs/dev/architecture.md -docs/dev/Issues/issues_open.md -``` - -### Architecture Updates - -In `architecture.md`, update the object hierarchy to include -`CategoryOwner`. - -Replace the current hierarchy with something like: - -```text -GuardedBase -|-- CategoryItem -|-- CollectionBase -| |-- CategoryCollection -| `-- DatablockCollection -`-- CategoryOwner - `-- DatablockItem -``` - -Add this design rule: - -```text -A real datablock is a CategoryOwner that serializes as a CIF data_ block. -Not every CategoryOwner is a real datablock. -``` - -Update the flat category structure section from: - -```text -Owner (DatablockItem / Analysis) -``` - -to: - -```text -Owner (CategoryOwner) -``` - -Explain that: - -- `Structure` and `ExperimentBase` are real datablocks. -- `Analysis` is a singleton category-owning section. -- Project configuration may become a singleton category-owning section. - -### Issues File Updates - -When the migration is complete, update issue 5 in -`docs/dev/Issues/issues_open.md`. - -If fully complete, remove the issue and add a note to closed issues if -that is the project convention. - -If only `CategoryOwner` is introduced but `Analysis` is not migrated -yet, change the issue text to reflect the remaining work. - -## Final Acceptance Criteria - -The migration is done when all of these are true: - -1. `CategoryOwner` exists and is tested. -2. `DatablockItem` inherits from `CategoryOwner`. -3. `Structure.as_cif` still emits `data_`. -4. `Experiment.as_cif` still emits `data_`. -5. `Analysis` inherits from `CategoryOwner`. -6. `Analysis.as_cif` does not emit a `data_` header. -7. `Analysis` uses shared category discovery for `categories`. -8. `Analysis` uses shared parameter enumeration for `parameters`. -9. Inactive analysis categories remain accessible. -10. Inactive analysis categories are not serialized. -11. Dirty-flag marking works for structures, experiments, and analysis. -12. Project save/load format remains compatible. -13. Documentation distinguishes real datablocks from category-owning - sections. - -## Common Mistakes To Avoid - -### Mistake: Making `Analysis` A Fake Datablock - -Do not solve this by setting: - -```python -self._identity.datablock_entry_name = lambda: "analysis" -``` - -That makes parameter identities and CIF output imply that analysis is a -real datablock. It is not. - -### Mistake: Adding A Boolean Flag To `DatablockItem` - -Avoid: - -```python -class Analysis(DatablockItem): - emit_data_header = False -``` - -This works mechanically but weakens the model. The class name -`DatablockItem` should mean real datablock. - -### Mistake: Changing Fit Parameter Semantics - -Do not change `project.parameters` to include analysis parameters during -this migration. Analysis parameters are configuration parameters, not -model parameters for fitting. - -### Mistake: Refactoring Public API While Moving Base Classes - -Do not rename `fitting_mode_type`, `joint_fit`, `sequential_fit`, -`rendering`, or `info` during this migration. Naming cleanup can happen -later. - -### Mistake: Assuming `vars(owner)` Order Is A Design Contract - -If serialization order matters, write tests for it. Prefer explicit -`_serializable_categories()` hooks for owners that need a specific -active subset. - -## Minimal Agent Checklist - -The authoritative checklist lives in the **Status** section at the top -of this document. Keep it updated as work progresses. - -## Phase 2 Verification - -Run these in order after Phase 1 is reviewed and approved. - -Per-area focused unit tests first: - -```shell -pixi run unit-tests tests/unit/easydiffraction/core -pixi run unit-tests tests/unit/easydiffraction/analysis -pixi run unit-tests tests/unit/easydiffraction/io/cif -pixi run unit-tests tests/unit/easydiffraction/datablocks -pixi run unit-tests tests/unit/easydiffraction/project -``` - -Then the full project verification commands required by -`.github/copilot-instructions.md`: - -```shell -pixi run fix -pixi run check -pixi run unit-tests -pixi run integration-tests -pixi run script-tests -``` - -`pixi run fix` regenerates `docs/dev/package-structure-*.md` -automatically — do not edit those by hand. Accept the auto-fix output -and re-run `pixi run check` until clean. - -## Suggested Pull Request - -**Title:** Separate category-owning sections from real CIF datablocks - -**Description (end-user oriented):** - -This change reorganizes how the library models the different pieces of a -project that own CIF-like categories. Crystal structures and experiments -stay "real" CIF data blocks — each one is saved with its own `data_` -header, exactly as before. The `Analysis` section (fitting mode, -aliases, constraints, joint/sequential fit settings) and project-level -configuration are now treated as singleton sections that share the same -convenient features (category discovery, parameter listing, dirty -tracking, `help()` tables) without pretending to be data blocks. As a -result: - -- Saved CIF files keep their current layout: structures and experiments - start with `data_`; analysis and project configuration stay as - section files without a fake `data_` header. -- `project.parameters`, `analysis.parameters`, and - `structure.parameters` behave consistently and discoverably. -- Future project metadata cleanup can reuse the same base class. - -No user-facing API names change in this PR; existing tutorials, scripts, -and saved projects continue to work.