feat(kalshi): add ohlcv_hourly model #9551
Add two new Kalshi prediction market spells built from API bronze tables:
- kalshi.market_details: market reference table joining markets_0003 with event metadata from market_details_0003. Filtered to markets with >= 100 contracts traded (6.5M of 39.8M markets, 99.7% of volume). Drops 12 universally null/constant columns (55 → 43).
- kalshi.market_trades: trade-level view enriched with market metadata via inner join to market_details, filtering out dust market trades.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename sources from _0003 to _raw (market_trades_raw, markets_raw, market_details_raw)
- Update contributor from dpettas to allelosi
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hourly OHLCV candles for Kalshi prediction markets:
- Built on yes_price_dollars from kalshi_market_trades
- Forward-fills no-trade hours via ASOF join on utils.hours spine
- Resolution-corrects close prices for settled markets (1.0/0.0)
- Extracts category from product_metadata JSON
- Full history coverage (2021+)
- Aligned output schema with polymarket_polygon.ohlcv_hourly
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… in details
- market_trades: add amount_usd (yes_price_dollars * count_fp) and _updated_at
- market_details: extract category from product_metadata JSON
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sion refresh Made-with: Cursor
…pose_spells hook Made-with: Cursor
Pushed some changes to get the … We are also blocked here until we finalize the other PR, which builds the upstream Kalshi tables. Those tables also exist in this PR, but without the latest changes.
PR Summary: Medium Risk
Overview: Refactors …
Reviewed by Cursor Bugbot for commit 72b2402.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Resolution correction makes close exceed high/low bounds
- Applied resolution correction to open, high, and low in addition to close to maintain the OHLCV invariant for post-expiration settled markets.
Or push these changes by commenting:
@cursor push 4216e8e531
Preview (4216e8e531)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
@@ -116,9 +116,6 @@
f.ticker,
f.market_name,
f.event_ticker,
- f.open,
- f.high,
- f.low,
case
when m.expiration_time is not null
and f.hour > m.expiration_time
@@ -127,6 +124,42 @@
case
when m.result = 'yes' then 1.0
when m.result = 'no' then 0.0
+ else f.open
+ end
+ else f.open
+ end as open,
+ case
+ when m.expiration_time is not null
+ and f.hour > m.expiration_time
+ and m.result in ('yes', 'no')
+ then
+ case
+ when m.result = 'yes' then 1.0
+ when m.result = 'no' then 0.0
+ else f.high
+ end
+ else f.high
+ end as high,
+ case
+ when m.expiration_time is not null
+ and f.hour > m.expiration_time
+ and m.result in ('yes', 'no')
+ then
+ case
+ when m.result = 'yes' then 1.0
+ when m.result = 'no' then 0.0
+ else f.low
+ end
+ else f.low
+ end as low,
+ case
+ when m.expiration_time is not null
+ and f.hour > m.expiration_time
+ and m.result in ('yes', 'no')
+ then
+ case
+ when m.result = 'yes' then 1.0
+ when m.result = 'no' then 0.0
else f.close
end
else f.close
Adds QUALIFY ROW_NUMBER() to keep only the latest row per event_ticker from market_details_raw, preventing potential duplicate ticker rows if the raw source ever contains multiple snapshots per event. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QUALIFY is not supported in Trino/DuneSQL. Rewrote event_details deduplication as a subquery with ROW_NUMBER() + WHERE rn = 1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
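The subquery rewrite described in this commit could look roughly like the sketch below. Only `event_ticker` comes from the commit messages; the `snapshot_ts` and `title` columns and the inline sample rows are assumptions made up for illustration, standing in for `market_details_raw`.

```sql
-- Sketch only: keep the latest row per event_ticker without QUALIFY.
-- snapshot_ts and title are hypothetical columns; the VALUES rows stand in
-- for market_details_raw so the query is self-contained.
with raw_details as (
    select *
    from (
        values
            ('EVT-A', 'first snapshot', timestamp '2025-01-01 00:00:00')
            , ('EVT-A', 'second snapshot', timestamp '2025-01-02 00:00:00')
            , ('EVT-B', 'only snapshot', timestamp '2025-01-01 00:00:00')
    ) as t (event_ticker, title, snapshot_ts)
)
, ranked as (
    select
        event_ticker
        , title
        , snapshot_ts
        , row_number() over (
            partition by event_ticker
            order by snapshot_ts desc
        ) as rn
    from raw_details
)
select
    event_ticker
    , title
    , snapshot_ts
from ranked
where rn = 1 -- plays the role of QUALIFY ROW_NUMBER() ... = 1
```

Filtering on `rn = 1` in an outer query gives one row per `event_ticker`, which is what the QUALIFY version was meant to do.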
- Replace JSON extraction `try(json_extract_scalar(product_metadata, '$.category'))` with the native `category` column now available in market_details_raw (100% populated, 19 distinct values).
- Add `mve_collection_ticker` from markets_raw to the gold layer. 81% of Kalshi markets are multivariate events (MVE); this column links sub-event markets to their parent MVE collection (e.g., KXMVECBCHAMPIONSHIP-R), enabling downstream grouping and filtering by collection.
- Update _schema.yml descriptions accordingly.
- Point all sources from *_raw to *_0004 tables
- Remove 2025 date constraint from OHLCV (full history)
- Cap hour spine at current hour (matches polymarket behavior)
- Use native category column from market_details_0004 instead of json_extract_scalar on product_metadata
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- kalshi_market_details: scope event_details dedupe via inner join to markets CTE (avoids full market_details_raw scan on incremental runs); switch to explicit column projection; rename watermark_ts -> source_updated_at to avoid confusion with pipeline-oriented _updated_at; add _updated_at = now() for operational freshness; drop post_hook.
- kalshi_market_trades: explicit column projection; leading commas; consistent inner-join pre-scoping; update reference to source_updated_at; add merge_skip_unchanged = true to skip no-op dimension refreshes (matches polymarket analog); drop post_hook.
- _schema.yml: rename watermark_ts column, add _updated_at to market_details, refresh descriptions.
Made-with: Cursor
…atic tag
- Drop kalshi_market_details.sql and kalshi_market_trades.sql (those land in PR #9549 instead); this PR depends on #9549 merging first.
- Revert sources/kalshi/_sources.yml (raw source declarations come from #9549).
- Trim _schema.yml to only the kalshi_ohlcv_hourly entry.
- Remove `static` tag from kalshi_ohlcv_hourly (config + schema) so the model refreshes with new data.
- Also remove `static` tag from polymarket_polygon.ohlcv_hourly for consistency.
Follow-up to 2e6d394, which landed only a partial snapshot. This commit completes the review:
- kalshi_market_details: drop ci-stamp and post_hook; switch event_details pre-filter from `in (select ...)` to inner join (consistent with trades); add `now() as _updated_at` for pipeline-time freshness.
- kalshi_market_trades: drop ci-stamp and post_hook; reformat config block to single-line style; add merge_skip_unchanged = true; explicit column projection; leading commas throughout; newline after SQL keywords.
- _schema.yml: document _updated_at column on kalshi_market_details.
Made-with: Cursor
…s' into feat/kalshi-ohlcv-hourly
Made-with: Cursor
# Conflicts:
# dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
…rd-fill anchor
Preserves main-branch logic while switching materialization from table to
incremental merge. Adds a prior_sparse CTE that re-reads real-trade aggregates
from {{ this }} outside the incremental window, unioned with new sparse
aggregates so market_bounds and the asof forward-fill stay correct across the
window boundary. Adds block_month partition, _updated_at, and destination
incremental_predicate for target pruning.
Made-with: Cursor
@cursor review
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Resolution correction applied to all OHLCV fields inconsistently
- Removed resolution correction from open, high, and low fields in Kalshi model to match Polymarket's behavior of only correcting the close price.
Or push these changes by commenting:
@cursor push ac418747a5
Preview (ac418747a5)
diff --git a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
--- a/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
+++ b/dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_ohlcv_hourly.sql
@@ -147,32 +147,14 @@
f.market_id,
f.outcome,
f.market_name,
+ f.open,
+ f.high,
+ f.low,
case
when m.expiration_time is not null
and f.hour > m.expiration_time
and m.result in ('yes', 'no')
then case when m.result = 'yes' then 1.0 else 0.0 end
- else f.open
- end as open,
- case
- when m.expiration_time is not null
- and f.hour > m.expiration_time
- and m.result in ('yes', 'no')
- then case when m.result = 'yes' then 1.0 else 0.0 end
- else f.high
- end as high,
- case
- when m.expiration_time is not null
- and f.hour > m.expiration_time
- and m.result in ('yes', 'no')
- then case when m.result = 'yes' then 1.0 else 0.0 end
- else f.low
- end as low,
- case
- when m.expiration_time is not null
- and f.hour > m.expiration_time
- and m.result in ('yes', 'no')
- then case when m.result = 'yes' then 1.0 else 0.0 end
else f.close
end as close,
f.vwap,
Made-with: Cursor
# Conflicts:
# dbt_subprojects/daily_spellbook/models/_projects/kalshi/_schema.yml
# dbt_subprojects/daily_spellbook/models/_projects/kalshi/kalshi_market_trades.sql
…eserve invariant
Polymarket previously corrected only close, which violated low <= {open, close}
<= high on post-expiration forward-filled hours (e.g. open=0.73, close=1.00).
Widen correction to open/high/low so every candle holds the OHLCV invariant,
aligning Polymarket's behavior with Kalshi. Factor the settlement value into a
single settled_price column consumed via coalesce; Kalshi refactored to the
same shape for structural symmetry (output unchanged).
Made-with: Cursor
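A minimal sketch of the `settled_price` plus `coalesce` shape described in this commit, assuming the same CASE condition shown in the Bugbot diff previews. The CTE names and the two inline sample candles are made up for illustration and are not taken from the actual model.

```sql
-- Sketch only: compute the settlement value once, then apply it to all four
-- OHLC fields so low <= {open, close} <= high still holds after expiration.
-- The sample rows and CTE names are hypothetical.
with candles as (
    select *
    from (
        values
            -- hour before expiration: a real traded candle
            (timestamp '2025-01-01 10:00:00', 'T1', 0.70, 0.75, 0.68, 0.73
                , timestamp '2025-01-01 10:30:00', 'yes')
            -- hour after expiration: a forward-filled candle
            , (timestamp '2025-01-01 11:00:00', 'T1', 0.73, 0.73, 0.73, 0.73
                , timestamp '2025-01-01 10:30:00', 'yes')
    ) as t (hour, ticker, open, high, low, close, expiration_time, result)
)
, settled as (
    select
        *
        -- null unless the market is resolved and the hour is past expiration
        , case
            when expiration_time is not null
                and hour > expiration_time
                and result in ('yes', 'no')
            then case when result = 'yes' then 1.0 else 0.0 end
        end as settled_price
    from candles
)
select
    hour
    , ticker
    , coalesce(settled_price, open) as open
    , coalesce(settled_price, high) as high
    , coalesce(settled_price, low) as low
    , coalesce(settled_price, close) as close
from settled
```

In this sample the 10:00 candle is left untouched because it is not past `expiration_time`, while the 11:00 candle has all four prices replaced by the settlement value. Keeping the condition in one column means the four output fields cannot drift apart.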
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
…ion fires
The prior `try_cast(substring(market_end_time from 1 for 19) as timestamp)` silently returned null for values like `2025-12-31T00:00:00Z` (Trino's timestamp cast does not accept the `T` separator), so `market_end_time_ts` was always null and the resolution CASE never applied. Switch to `try(from_iso8601_timestamp(market_end_time))` so ~8.9M post-expiration rows on resolved markets are correctly forced to 0.0/1.0 across all four OHLC fields.
Made-with: Cursor
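The two parsing approaches can be compared side by side with a small query like the following sketch. The inline sample values and output column names are assumptions for illustration; the two expressions are the ones quoted in the commit message.

```sql
-- Sketch only: compare the old cast-based parse with the ISO 8601 parser.
-- raw_value stands in for market_end_time; the sample rows are hypothetical.
select
    raw_value
    -- prior approach; per the commit message this yields null when the value
    -- uses the 'T' separator
    , try_cast(substring(raw_value from 1 for 19) as timestamp) as parsed_via_cast
    -- replacement; try() turns a parse failure into null instead of an error
    , try(from_iso8601_timestamp(raw_value)) as parsed_via_iso8601
from (
    values
        ('2025-12-31T00:00:00Z')
        , ('not a timestamp')
) as t (raw_value)
```

Note that `from_iso8601_timestamp` returns a timestamp with time zone, so comparisons against a plain timestamp column such as the candle hour rely on Trino's implicit coercion.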

Summary
Adds `kalshi.ohlcv_hourly`: hourly OHLCV candles for Kalshi prediction markets. Depends on #9549 (market_details + market_trades).

What it does
- … `kalshi_market_trades` with deterministic ordering (`created_time`, `trade_id`)
- … `utils.hours`, forward-fill via `ASOF LEFT JOIN`
- … `is_forward_filled` flag, VWAP null on no-trade hours
- … `product_metadata` JSON
- … `polymarket_polygon.ohlcv_hourly` for cross-venue consistency (BlackRock trial)

Kalshi vs Polymarket OHLCV differences
- … `condition_id`/`token_outcome`: uses `ticker` as market ID
- … `outcome` is always 'Yes' (each Kalshi ticker is a single market; No price = 1 - Yes price)
- … `created_time` + `trade_id` (off-chain, no block_time/evt_index)
- … `result` field + `expiration_time` (vs Polymarket's `market_outcome` + `market_end_time`)

Test plan
- `dbt compile` passes
- … `(hour, market_id, outcome)`

🤖 Generated with Claude Code
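As a rough sketch of checks that could accompany the test plan, the query below counts duplicate `(hour, market_id, outcome)` keys and candles that break the `low <= {open, close} <= high` invariant discussed in the commits. It assumes the `(hour, market_id, outcome)` item refers to a uniqueness check and that the model is queryable as `kalshi.ohlcv_hourly`; neither is confirmed by the text above.

```sql
-- Sketch only: both counts should be 0 if the model behaves as described.
-- Assumes the model is exposed as kalshi.ohlcv_hourly with these columns.
with duplicate_keys as (
    select hour, market_id, outcome
    from kalshi.ohlcv_hourly
    group by hour, market_id, outcome
    having count(*) > 1
)
, broken_candles as (
    select hour, market_id, outcome
    from kalshi.ohlcv_hourly
    where low > least(open, close)
        or high < greatest(open, close)
)
select 'duplicate_key' as check_name, count(*) as failing_rows
from duplicate_keys
union all
select 'ohlc_invariant' as check_name, count(*) as failing_rows
from broken_candles
```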