
Raise nem_data_model_start_time floor from 2009 to 2015 #82

Merged
nick-gorman merged 1 commit into master from raise-start-search-floor-to-2015
May 25, 2026

@nick-gorman nick-gorman commented May 25, 2026

Summary

Cherry-picked from #67 (commit 66189a8): a one-line change to src/nemosis/defaults.py that raises the nem_data_model_start_time floor from 2009/07/01 to 2015/01/01, with an inline comment explaining why.

Implements the short-term fix recommended in #66.

Why

nem_data_model_start_time is only consumed by tables with search_type = "all" in processing_info_maps.py — i.e. cumulative-snapshot tables: PARTICIPANT, DUDETAIL, GENCONDATA, SPDREGIONCONSTRAINT, SPDCONNECTIONPOINTCONSTRAINT, SPDINTERCONNECTORCONSTRAINT, LOSSMODEL, LOSSFACTORMODEL, MNSP_INTERCONNECTOR, INTERCONNECTOR, INTERCONNECTORCONSTRAINT, MARKET_PRICE_THRESHOLDS (12 tables total).

For these, _set_up_dynamic_compilers sets start_search = nem_data_model_start_time, then _dynamic_data_fetch_loop iterates monthly from there to end_time. With the floor at 2009-07, the loop probes 66 months of pre-2015 URLs (2009-07 through 2014-12), all of which 404, because AEMO restructured the 2009-2014 archives into one zip per month containing all tables, rather than the per-table monthly zips NEMOSIS expects from 2015 onward. As #66 notes: "Nemosis just throws an error, then moves on". Raising the floor skips that doomed scan window.
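The size of the skipped window can be checked with a short standalone sketch. It assumes one probe per calendar month starting at the floor; `monthly_probes` is a hypothetical helper written for illustration, not the actual `_dynamic_data_fetch_loop`, and the `end_time` value is an arbitrary example.

```python
from datetime import datetime

# Timestamp format matching the "YYYY/MM/DD HH:MM:SS" strings used in this PR.
FMT = "%Y/%m/%d %H:%M:%S"


def monthly_probes(start_search: str, end_time: str):
    """Yield (year, month) for every calendar month from start_search
    up to and including the month of end_time.

    Hypothetical stand-in for the monthly iteration described above.
    """
    start = datetime.strptime(start_search, FMT)
    end = datetime.strptime(end_time, FMT)
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


old_floor = "2009/07/01 00:00:00"
new_floor = "2015/01/01 00:00:00"
end_time = "2016/01/01 00:00:00"  # arbitrary example query end

old = list(monthly_probes(old_floor, end_time))
new = list(monthly_probes(new_floor, end_time))

# Months probed under the old floor but not under the new one.
skipped = [m for m in old if m not in new]
print(len(skipped), skipped[0], skipped[-1])
```

Every month in `skipped` falls before 2015, so under the behaviour described above each one would have been a 404 per "all" table.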

This is a speedup with no additional data loss: pre-2015 data already fails to load on master today. The change just makes that failure faster (no minutes of 404 retries) and trims log noise.

What's not affected

  • The 24-ish tables with search_type = "start_to_end" or "end" — those derive start_search from the user-supplied start_time/end_time directly, not from this default.
  • Users who genuinely want to override the floor for testing — tests/end_to_end_table_tests/test_*.py already monkeypatch defaults.nem_data_model_start_time = "2021/05/01 00:00:00" to narrow scan windows; that path still works.
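The two points above can be illustrated with a minimal sketch. Everything here is a stand-in: `defaults` is a `SimpleNamespace` playing the role of `nemosis.defaults`, `pick_start_search` is a hypothetical simplification of how start_search is chosen per search_type (the "end" branch in particular is an assumption), and `override_floor` is an illustrative helper that wraps the same direct assignment the end-to-end tests use.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for nemosis.defaults (assumption: the floor is stored as a string).
defaults = SimpleNamespace(nem_data_model_start_time="2015/01/01 00:00:00")


def pick_start_search(search_type: str, start_time: str, end_time: str) -> str:
    """Hypothetical simplification: only "all" tables read the default floor."""
    if search_type == "all":
        return defaults.nem_data_model_start_time
    if search_type == "start_to_end":
        return start_time
    if search_type == "end":
        return end_time  # assumption: "end" tables search from end_time
    raise ValueError(f"unknown search_type: {search_type!r}")


@contextmanager
def override_floor(value: str):
    """Temporarily replace the floor, restoring it afterwards."""
    original = defaults.nem_data_model_start_time
    defaults.nem_data_model_start_time = value
    try:
        yield
    finally:
        defaults.nem_data_model_start_time = original


start, end = "2022/01/01 00:00:00", "2022/02/01 00:00:00"

with override_floor("2021/05/01 00:00:00"):
    narrowed = pick_start_search("all", start, end)

restored = pick_start_search("all", start, end)
unaffected = pick_start_search("start_to_end", start, end)
```

Restoring the value in a `finally` block keeps an override in one test from leaking into the next; pytest's `monkeypatch.setattr` gives the same guarantee inside a real test suite.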

Future work

#66 keeps tracking the long-term aspiration of parsing the whole-of-month zip format AEMO uses for 2009-2014, which would restore access to those years. This PR doesn't close #66 — it implements the short-term mitigation Matt explicitly noted in that issue while leaving the deeper fix open.

No README change

README doesn't currently document a date floor anywhere, and this isn't a behavior change so much as a perf fix for an already-broken case. The inline source comment + #66 is sufficient context for future readers.

Test plan

  • uv run pytest tests/ — 367 pass, 1 skipped, 3 warnings (all pre-existing, unrelated)
  • CI green

Related: #66, supersedes the same change from #67.

@nick-gorman nick-gorman merged commit 5a221d7 into master May 25, 2026
16 checks passed

Development

Successfully merging this pull request may close these issues.

Handle whole-of-month zips for 2009-2014
